IT Department CIST IT - Inst&Conf - Nagios Project - V01 DRAFT - 2007615.doc
Installation & Configuration
"Nagios 3.0 with openSUSE Linux 10.2"
28-Aug-12 CIST IT - Inst&Conf - Nagios Project - V01 DRAFT - 2007615.doc
1 NAGIOS CIST PHNOM PENH PROJECT
  1.1 Introduction
  1.2 Requirements
2 INSTALLATION LINUX SUSE 10.2
  2.1 Setup steps
    2.1.1 Starting with Hardware Configuration
    2.1.2 Starting with Linux Suse 10.2 Installation
3 NAGIOS INSTALLATION
  3.1 Introduction
  3.2 Required Packages
  3.3 Create Account Information
  3.4 Download Nagios and the Plugins
  3.5 Compile and Install Nagios
  3.6 Customize Configuration
  3.7 Configure the Web interface
  3.8 Compile and Install the Nagios Plugins
  3.9 Start Nagios
  3.10 Login to the Web Interface
  3.11 Other Modifications
4 NAGIOS DOCUMENTATIONS
5 THANKS
6 NAGIOS FILES
  6.1 Introduction
7 BASICS OF NAGIOS
  7.1 Description
  7.2 What Are Objects
  7.3 Where Are Objects Defined
  7.4 How Are Objects Defined
  7.5 Objects Explained
8 CONFIGURATIONS FILES FOR CIST
  8.1 Location
  8.2 Backup the Configuration Files
  8.3 Nagios.cfg
  8.4 Localhost.cfg
  8.5 Explanations of localhost file and services
    8.5.1 Creating A Host Definition
    8.5.2 Monitoring HTTP
  8.6 Others cfg files: switch.cfg, printer.cfg...
9 MONITORING WINDOWS MACHINES
  9.1 Introduction
  9.2 Installing the Windows Agent
  9.3 Nagios Host Configuration
  9.4 Monitoring Services
  9.5 Monitoring NSClient++ Version
  9.6 Monitoring Uptime
  9.7 Monitoring Cpu Load
  9.8 Monitoring Memory Usage
  9.9 Monitoring Disk Usage
  9.10 Monitoring A Windows Service
  9.11 Monitoring A Windows Process
10 STATUSMAP
  10.1 How to have a smooth map
  10.2 Add / Changing Icons
    10.2.1 Icon image
    10.2.2 Vrml_image
    10.2.3 Statusmap_image
11 CIST MONITORED HOSTS
1 Nagios CIST Phnom Penh project
1.1 Introduction
- Nagios is a monitoring tool that runs on Linux (GNU license). It monitors all the network objects of an IT environment and draws a map of them.
1.2 Requirements
- Nagios will be installed on a PTC computer (Intel platform, P4 2.8 GHz, 2 GB RAM).
- This machine is located in the CIST IT servers room.
- The installation will be done on a RAID 1 architecture (two mirrored disks) to be more reliable.
Note: Nagios is a monitoring server and has no significant impact on production. Scanning the network objects uses some bandwidth, so Nagios will not make the network faster. The hosts that run a client plugin are not noticeably affected by the plugin either.
2 Installation Linux Suse 10.2
2.1 Setup steps
Installation with 2 hard drives in RAID 1 mode.
2.1.1 Starting with Hardware Configuration
We assume that the PC/server has 2 hard drives (SATA).
- Check in your BIOS that both drives are recognized (Award BIOS: press "Del" to enter "Setup" -> "Standard CMOS Setup").
- Make sure that "First Boot Device" is CD/DVD, so that the machine boots from the Linux Suse 10.2 DVD.
2.1.2 Starting with Linux Suse 10.2 Installation
RAID installation (partitioning)
Insert the DVD "Suse Linux 10.2 ISO" and power on the PC/server.
    First menu openSUSE -> Installation
    Language -> Yes -> English US (default)
    Installation mode -> New Installation
    Clock and Time Zone -> Asia - Phnom Penh - Local Time (hardware clock)
    Desktop Selection -> Gnome
    Installation Settings -> Partitioning -> Create Custom Partition Setup -> Custom Partitioning (for experts) (see the new documentation InstallRAID1.odt)
On an 80 GB hard drive:
    swap    2 GB
    /boot   about 1 GB
    /       about 53 GB
    /home   about 20 GB
Software installation (Software)
Just after the RAID setup, select "Software" to add the additional packages needed by Nagios:
    Check "Details" at the bottom -> in the Filter menu -> Search
    -> apache2 -> select "apache2" (28mb)
    -> gcc -> select "gcc - The system GNU C Compiler" -> Accept
    Agree -> Install -> Automatic reboot
After the reboot:
    Password for System Administrator -> you are invited to enter the root password
    Hostname and Domain Name -> Pursat and cist.lan
        -> uncheck "Change Hostname via DHCP"
    Network Configuration -> enable SSH in the Firewall menu -> SSH open
    Network Interface -> click on Network Interface -> Edit -> Static IP Address -> 10.20.30.44 / 255.255.255.0
        -> Hostname and Name Server -> Name Server 1 and 2 = 10.20.30.43 / 10.20.30.40
        -> Routing -> 10.20.30.20
        -> Next to validate your configuration
    Test Internet Connection -> Yes; normally a response appears with "Success"
    Novell Customer Center Configuration -> Configure Later
    Additional Installation Sources -> No
    User Authentication Method -> check Local (/etc/passwd)
    New Local User -> Full name: nagios administrator; Username: nagios / Nagiosadmin, and password
    Hardware Configuration -> check your configuration -> Next
    Installation Completed -> Finish
3 Nagios Installation
This has been taken from www.nagios.org: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html
3.1 Introduction
This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on openSUSE and have it monitoring your local machine inside of 20 minutes. No advanced installation options are discussed here, just the basics that will work for 95% of users who want to get started. These instructions were written based on an openSUSE 10.2 installation.
3.2 Required Packages
Make sure you've installed the following packages on your openSUSE installation before continuing. You can use yast to install packages under openSUSE.
- apache2
- C/C++ development libraries
To check that you have Internet access, open a terminal session and run:
    nslookup www.google.com
3.3 Create Account Information
Become the root user.
    su -l
Create a new nagios user account and give it a password.
    /usr/sbin/useradd nagios
    passwd nagios
Create a new nagios group. Add the nagios user to the group.
    /usr/sbin/groupadd nagios
    /usr/sbin/usermod -G nagios nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.
    /usr/sbin/groupadd nagcmd
    /usr/sbin/usermod -G nagcmd nagios
    /usr/sbin/usermod -G nagcmd wwwrun
3.4 Download Nagios and the Plugins
Create a directory for storing the downloads.
    mkdir ~/downloads
    cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively.
    wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
    wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
3.5 Compile and Install Nagios
Extract the Nagios source code tarball.
    cd ~/downloads
    tar xzf nagios-3.0a4.tar.gz
    cd nagios-3.0a4
Run the Nagios configure script, passing the name of the group you created earlier, like so:
    ./configure --with-command-group=nagcmd
Compile the Nagios source code.
    make all
Install binaries, init script, sample config files and set permissions on the external command directory.
    make install
    make install-init
    make install-config
    make install-commandmode
Don't start Nagios yet. There's still more that needs to be done.
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web interface
Install the Nagios web config file in the Apache conf.d directory.
    make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account, you'll need it later.
    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect.
    service apache2 restart     (or: rcapache2 start)
    service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball.
    cd ~/downloads
    tar xzf nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7
Compile and install the plugins.
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make
    make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them start automatically when the system boots.
    chkconfig --add nagios
    chkconfig nagios on
    chkconfig --add apache2
    chkconfig apache2 on
Verify the sample Nagios configuration files.
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If problems are reported:
    mkdir /home/nagios
    chown nagios /home/nagios
If there are no errors, start Nagios.
    service nagios start
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.
    http://localhost/nagios/ or http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this by:
- Opening the control center
- Select "Open Administrator Settings" to open the YaST administrator control center
- Select "Firewall" from the "Security and Users" category
- Click the "Allowed Services" option in the Firewall Configuration window
- Add "HTTP Server" to the allowed services list for the "External Zone"
- Click "Next" and "Accept" to activate the new firewall settings
4 Nagios Documentations
The main Nagios site is www.nagios.org.
The online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main Nagios web interface.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven mainly by configuration files (.cfg files). To activate a new configuration, modify those files with your favorite editor (vi, gedit...). You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on host/object definitions.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
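To see which object files a given main configuration file actually loads, the directives can be listed with a few lines of Python. This is only an illustrative sketch, not part of Nagios: the function name object_config_sources is hypothetical, and the sample text is a shortened stand-in for the real nagios.cfg.

```python
def object_config_sources(main_cfg_text):
    """Return (cfg_files, cfg_dirs) named in the text of a main config file."""
    cfg_files, cfg_dirs = [], []
    for raw in main_cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented-out directives are ignored
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value directive
        key, value = key.strip(), value.strip()
        if key == "cfg_file":
            cfg_files.append(value)
        elif key == "cfg_dir":
            cfg_dirs.append(value)
    return cfg_files, cfg_dirs


# Shortened stand-in for nagios.cfg (hypothetical content for the demo)
sample = """\
log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/commands.cfg
cfg_file=/usr/local/nagios/etc/localhost.cfg
#cfg_file=/usr/local/nagios/etc/windows.cfg
#cfg_dir=/usr/local/nagios/etc/servers
"""
files, dirs = object_config_sources(sample)
print(files)
print(dirs)
```

A file that is still commented out does not appear in the result, which is a quick way to spot a forgotten directive.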
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
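The idea behind template inheritance can be sketched in Python. This is a simplified model, not the Nagios implementation: it assumes a single template per "use" directive, and the function name resolve and the sample dictionaries are hypothetical. As in Nagios, the "name" and "register" values of a template are not passed on to the objects that use it.

```python
def resolve(name, definitions, _seen=None):
    """Merge a definition with everything it inherits through 'use'."""
    seen = _seen or set()
    if name in seen:
        raise ValueError("circular 'use' chain at " + name)
    seen = seen | {name}
    definition = definitions[name]
    merged = {}
    parent = definition.get("use")
    if parent is not None:
        inherited = resolve(parent, definitions, seen)
        # 'name' and 'register' belong to the template itself
        merged.update({k: v for k, v in inherited.items()
                       if k not in ("name", "register")})
    # values set locally override inherited ones
    merged.update({k: v for k, v in definition.items() if k != "use"})
    return merged


definitions = {
    "generic-host": {"name": "generic-host",
                     "notification_period": "24x7", "register": "0"},
    "linux-server": {"name": "linux-server", "use": "generic-host",
                     "check_period": "24x7",
                     "notification_period": "workhours", "register": "0"},
    "kohkong": {"use": "linux-server", "host_name": "kohkong"},
}
print(resolve("kohkong", definitions))
```

The host ends up with notification_period set to workhours: the value from linux-server overrides the one inherited from generic-host.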
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.)
- Hosts have an address of some kind (e.g. an IP or MAC address)
- Hosts have one or more services associated with them
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for the hosts and services they are responsible for
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
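The parent/child relationship between hosts is what lets Nagios tell a host that is really DOWN from one that is merely UNREACHABLE because a device in front of it has failed. The sketch below is a simplified model of that rule, assuming there are no circular parent relationships; the function name classify is hypothetical, and the host names are only examples.

```python
def classify(host, parents, responding):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for one host.

    parents    -- dict: host name -> list of parent host names
    responding -- set of host names whose check currently succeeds
    """
    if host in responding:
        return "UP"
    host_parents = parents.get(host, [])
    if not host_parents:
        return "DOWN"  # nothing in front of this host to blame
    if any(classify(p, parents, responding) == "UP" for p in host_parents):
        return "DOWN"  # a parent answers, so the host itself has failed
    return "UNREACHABLE"  # every parent is down or unreachable


parents = {"CistSW001": [], "kohkong": ["CistSW001"], "takeo": ["CistSW001"]}

# The switch answers but kohkong does not: kohkong itself is DOWN.
print(classify("kohkong", parents, {"CistSW001", "takeo"}))
# Nothing answers: the switch is DOWN, the hosts behind it are UNREACHABLE.
print(classify("kohkong", parents, set()))
```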
8 Configurations Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc/
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and the htpasswd file (htpasswd.users, created in section 3.7). Just copy them to the directory of your choice, that's it.
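The copy can also be scripted. The following Python sketch is one possible way to do it, not an official tool: the function name backup_configs and the default patterns are assumptions (every .cfg file, plus the htpasswd.users password file from section 3.7), and the demonstration runs on a throw-away directory instead of /usr/local/nagios/etc.

```python
import shutil
import tempfile
from pathlib import Path


def backup_configs(src_dir, dest_dir, patterns=("*.cfg", "htpasswd.users")):
    """Copy the files matching the patterns from src_dir to dest_dir."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            if path.is_file():
                shutil.copy2(path, dest / path.name)  # keeps timestamps
                copied.append(path.name)
    return copied


# Demonstration on a temporary directory with placeholder files
with tempfile.TemporaryDirectory() as tmp:
    etc = Path(tmp) / "etc"
    etc.mkdir()
    for name in ("nagios.cfg", "localhost.cfg", "htpasswd.users", "notes.txt"):
        (etc / name).write_text("# placeholder\n")
    print(backup_configs(etc, Path(tmp) / "backup"))
```

On the real server the call would be something like backup_configs("/usr/local/nagios/etc", "/root/nagios-backup"), run as root; the destination path here is only an example.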
8.3 Nagios.cfg
nagios.cfg is the master file that loads all the other files. By default, not all of them are read, so you need to enable the reading of the other configuration files: look for the lines marked "<--------- UNCOMMENT" below and uncomment them to activate them.

# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file!!!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg        <--------- UNCOMMENT
# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg       <--------- UNCOMMENT
# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg         <--------- UNCOMMENT
# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg         <--------- UNCOMMENT
# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg          <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg describes only the Nagios host itself, but you can copy/paste its definitions within the same file to add your new hosts, typically UNIX/Linux machines. For Windows machines there is another file, windows.cfg.

# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           Normal Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }

# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server    ; The name of this host template
        use                     generic-host    ; This template inherits other values from the generic-host template
        check_period            24x7            ; By default, Linux hosts are checked round the clock
        check_interval          5               ; Actively check the host every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each Linux host 10 times (max)
        check_command           check-host-alive ; Default command to check Linux hosts
        notification_period     workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers hereafter.

# Nagios host
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

# kohkong
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

# takeo (use "," to have multiple IP addresses)
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new host to the allhosts group here, like kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service 'freshness'
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# ---- START COPY/PASTE for SERVICES ----

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# ---- END COPY/PASTE for SERVICES ----
# ---- Services for kohkong ----

# Define a service to "ping" the machine kohkong
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# ---- Services for takeo ----

# Define a service to "ping" the machine takeo
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
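In the check_command values of a service definition, "!" separates the command name from its arguments, which Nagios substitutes for $ARG1$, $ARG2$, etc. in the matching command_line from commands.cfg. The Python sketch below imitates that substitution to show what ends up being executed. It is an illustration only: the function name expand_check_command is hypothetical, the check_ping command line is assumed to match the sample commands.cfg, and /usr/local/nagios/libexec is assumed to be the value of $USER1$.

```python
import re


def expand_check_command(check_command, command_lines, host_address,
                         user1="/usr/local/nagios/libexec"):
    """Expand a check_command value into the command line to execute."""
    name, *args = check_command.split("!")
    line = command_lines[name]
    line = line.replace("$USER1$", user1)
    line = line.replace("$HOSTADDRESS$", host_address)
    for number, arg in enumerate(args, start=1):
        line = line.replace("$ARG%d$" % number, arg)
    # arguments that were not supplied expand to nothing
    return re.sub(r"\$ARG\d+\$", "", line).strip()


# Command lines assumed to match the sample commands.cfg
command_lines = {
    "check_ping": "$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5",
    "check_http": "$USER1$/check_http -I $HOSTADDRESS$ $ARG1$",
}
print(expand_check_command("check_ping!100.0,20%!500.0,60%",
                           command_lines, "127.0.0.1"))
print(expand_check_command("check_http", command_lines, "127.0.0.1"))
```

For the PING service, the first argument (100.0,20%) becomes the warning threshold and the second (500.0,60%) the critical threshold: round-trip time in milliseconds, then packet loss.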
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
So, for our host kohkong, the lines look like this:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point - either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?
Here is our own HTTP example, defined for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
We can do the same for every service, including a new one such as HTTP on port 8080.
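A new service such as HTTP on port 8080 first needs its own command definition. The fragment below is only a sketch of what the check_http_8080 command referenced in localhost.cfg could look like: the command body is an assumption (it relies on the -p port option of the check_http plugin), and the host name takeo is reused from the earlier listing.

```cfg
# Assumed command definition: check_http against port 8080.
# (-p is the port option of the check_http plugin.)
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Service using the command above; notifications stay disabled,
# as for the other HTTP and SSH services in this file.
define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

The command goes into commands.cfg and the service into localhost.cfg, so the two definitions stay where the other commands and services already live.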
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main difference is that they are specially designed for "printers", "switches" and "windows". But for Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as: memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes:
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
        nsclient++ /install
5. Install the NSClient++ systray with the following command:
        nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we have made a change with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and the NSClient. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command:
        nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file.
First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are alive
        notification_period     24x7            ; Send notifications out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
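To make the argument passing concrete: the parts of a check_command value separated by "!" are handed to the command as $ARG1$, $ARG2$ and so on. The sketch below traces the CPU load check for winserver. The value /usr/local/nagios/libexec for $USER1$ is an assumption based on the default source install, and 192.168.1.2 is the example winserver address.

```cfg
# In the service definition:
#     check_command   check_nt!CPULOAD!-l 5,80,90
#
# Nagios splits the value on "!":
#     command name = check_nt
#     $ARG1$       = CPULOAD
#     $ARG2$       = -l 5,80,90
#
# Substituted into the command_line of check_nt, this runs (assuming
# $USER1$ = /usr/local/nagios/libexec and the host address 192.168.1.2):
#     /usr/local/nagios/libexec/check_nt -H 192.168.1.2 -p 12489 -v CPULOAD -l 5,80,90
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
```

Running the expanded command by hand from the Nagios server is a quick way to check that the agent answers before the service is added to the configuration.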
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case, we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons exist in three formats, including GIF and GD2. The best thing to do, however, is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
In this case we use suse.png.
10.2.1 Icon_image
This is for the normal menu of Nagios.
10.2.2 Vrml_image
This is for the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
This is for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 pixels, so you can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy them with your favorite SSH explorer into /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them...
[Figure: CIST network map]
Networks shown on the map: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones shown on the map: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Internet Access; Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): primary and secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, fixed public IP address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
1 Nagios CIST Phnom Penh project
1.1 Introduction
- Nagios is a monitoring tool based on Linux (GNU license). It makes it possible to monitor, and draw a map of, all the network objects in an IT environment.
1.2 Requirements
- Nagios will be installed on a PTC computer (Intel platform, P4 2.8 GHz, 2 GB RAM).
- This machine is located in the CIST IT servers room.
- The installation will be done on a RAID 1 architecture to be more efficient.
Note: Nagios is a monitoring server and has no further impact on production. It scans the network objects over the existing bandwidth, so the network itself will not be improved by Nagios. Hosts that run a plugin client are not otherwise affected by the plugin.
2 Installation Linux Suse 10.2
2.1 Setup steps
Installation with 2 hard drives in RAID 1 mode.
2.1.1 Starting with Hardware Configuration
We assume that the PC/server has 2 hard drives (SATA). Check in your BIOS that both drives are recognized (Award BIOS: press "Del" to enter "Setup" -> "Standard CMOS Setup"). Make sure that "First Boot Device" is CD/DVD, to boot from the Linux Suse 10.2 DVD.
2.1.2 Starting with Linux Suse 10.2 Installation
RAID installation (partitioning)
Insert the DVD "Suse Linux 10.2 ISO" and power on the PC/server.
First menu: OpenSUSE -> Installation
Language -> Yes -> English US (default)
Installation mode -> New Installation
Clock and Time Zone -> Asia - Phnom Penh - Local Time (hardware clock)
Desktop Selection -> Gnome
Installation Settings -> Partitioning -> Create Custom Partition Setup -> Custom Partitioning (for Experts) (new documentation: InstallRAID1.odt)
On an 80 GB hard drive: swap 2 GB, /boot about 1 GB, / about 53 GB, /home about 20 GB
Software Installation (Software): just after the RAID setup, select "Software" to add the additional packages for Nagios.
Check "Details" at the bottom -> in the Filter menu -> Search -> apache2 -> select "apache2" (28mb)
-> Search -> gcc -> select "gcc, the system GNU C Compiler" -> Accept
Agree -> Install -> Automatic Reboot
Password for System Administrator -> you are invited to enter the root password
Hostname and Domain Name -> Pursat and cist.lan
-> uncheck "Change Hostname via DHCP"
Network Configuration -> enable SSH in the Firewall menu -> SSH open
Network Interface -> click on Network Interface -> Edit -> Static IP Address -> 10.20.30.44 / 255.255.255.0
-> Hostname and Name Server -> Name Server 1 and 2 = 10.20.30.43, 10.20.30.40
-> Routing -> 10.20.30.20
-> Next to validate your configuration
Test Internet Connection -> Yes; normally a response appears with "Success"
Novell Customer Center Configuration -> Configure Later
Additional Installation Sources -> No
User Authentication Method -> check Local (/etc/passwd)
New Local User -> Full name: nagios administrator; Username: nagios / Nagiosadmin, and password
Hardware Configuration -> check your configuration -> Next
Installation Completed -> Finish
3 Nagios Installation
This has been taken from www.nagios.org: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html
3.1 Introduction
This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on openSUSE and have it monitoring your local machine within 20 minutes. No advanced installation options are discussed here - just the basics that will work for 95% of users who want to get started. These instructions were written based on an openSUSE 10.2 installation.
3.2 Required Packages
Make sure you've installed the following packages on your openSUSE installation before continuing. You can use yast to install packages under openSUSE.
- apache2
- C/C++ development libraries
Check that you have Internet access: open a terminal session and run
        nslookup www.google.com
3.3 Create Account Information
Become the root user:
        su -l
Create a new nagios user account and give it a password:
        /usr/sbin/useradd nagios
        passwd nagios
Create a new nagios group. Add the nagios user to the group:
        /usr/sbin/groupadd nagios
        /usr/sbin/usermod -G nagios nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group:
        /usr/sbin/groupadd nagcmd
        /usr/sbin/usermod -G nagcmd nagios
        /usr/sbin/usermod -G nagcmd wwwrun
3.4 Download Nagios and the Plugins
Create a directory for storing the downloads:
        mkdir ~/downloads
        cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively.
        wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
        wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
3.5 Compile and Install Nagios
Extract the Nagios source code tarball:
        cd ~/downloads
        tar xzf nagios-3.0a4.tar.gz
        cd nagios-3.0a4
Run the Nagios configure script, passing the name of the group you created earlier, like so:
        ./configure --with-command-group=nagcmd
Compile the Nagios source code:
        make all
Install binaries, init script, sample config files, and set permissions on the external command directory:
        make install
        make install-init
        make install-config
        make install-commandmode
Don't start Nagios yet - there's still more that needs to be done.
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web interface
Install the Nagios web config file in the Apache conf.d directory:
        make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later.
        htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect:
        service apache2 restart
        (or) rcapache2 start
        service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball:
        cd ~/downloads
        tar xzf nagios-plugins-1.4.7.tar.gz
        cd nagios-plugins-1.4.7
Compile and install the plugins:
        ./configure --with-nagios-user=nagios --with-nagios-group=nagios
        make
        make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them start automatically when the system boots:
        chkconfig --add nagios
        chkconfig nagios on
        chkconfig --add apache2
        chkconfig apache2 on
Verify the sample Nagios configuration files:
        /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are problems:
        mkdir /home/nagios
        chown nagios /home/nagios
If there are no errors, start Nagios:
        service nagios start
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.
        http://localhost/nagios/ or http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this by:
- Opening the control center
- Selecting "Open Administrator Settings" to open the YaST administrator control center
- Selecting "Firewall" from the "Security and Users" category
- Clicking the "Allowed Services" option in the Firewall Configuration window
- Adding "HTTP Server" to the allowed services list for the "External Zone"
- Clicking "Next" and "Accept" to activate the new firewall settings
4 Nagios Documentation
The main Nagios site is www.nagios.org.
The online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main web interface of Nagios.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
The main mechanism of Nagios is based on configuration files (.cfg files). These files have to be modified with your favorite editor (vi, gedit, ...) to activate a new configuration. You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with different configuration files, which are based on host/object configuration.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
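As a short illustration, the two directives look like this in the main configuration file. The paths follow the /usr/local/nagios/etc layout used in this document; the servers directory is only an example name.

```cfg
# Load object definitions from one specific file
cfg_file=/usr/local/nagios/etc/localhost.cfg

# Load every file with a .cfg extension found in a directory
cfg_dir=/usr/local/nagios/etc/servers
```

Both directives can be repeated as many times as needed, one file or directory per line.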
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation on object definitions.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
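A minimal sketch of template inheritance follows. The template name linux-server, the host web01 and its address are hypothetical and only serve the illustration; the admins contact group and the 24x7 time period are assumed to exist as in the sample files. The template carries the shared settings and is never registered as a real host, while the host pulls them in with use and adds only what is specific to it.

```cfg
# Template: holds shared defaults, not a real host
define host{
        name                    linux-server    ; Template name (hypothetical)
        check_command           check-host-alive
        check_period            24x7
        max_check_attempts      10
        notification_interval   30
        notification_period     24x7
        contact_groups          admins
        register                0               ; Do not register: template only
        }

# Real host: inherits everything above through "use"
define host{
        use                     linux-server
        host_name               web01           ; Hypothetical host
        alias                   Example web server
        address                 192.0.2.10      ; Placeholder address
        }
```

Changing a value in the template then changes it for every host that uses the template, unless the host overrides it locally.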
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which is used in the network reachability logic.
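The parent/child relationship is expressed with the parents directive. The sketch below assumes that the switch CISTSW001 from the CIST network is itself defined as a host; the template names generic-switch and linux-server and both addresses are placeholders, not the values used at CIST.

```cfg
# The switch is the parent: Nagios reaches kohkong through it
define host{
        use             generic-switch          ; Assumed template name
        host_name       CISTSW001
        alias           Switch CISTSW001
        address         192.0.2.1               ; Placeholder, not the real address
        }

define host{
        use             linux-server            ; Assumed template name
        host_name       kohkong
        alias           kohkong
        address         192.0.2.5               ; Placeholder, not the real address
        parents         CISTSW001               ; Host that sits between Nagios and kohkong
        }
```

With this in place, when CISTSW001 goes down Nagios reports kohkong as UNREACHABLE instead of DOWN, and the status map draws kohkong behind the switch.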
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for the hosts and services they are responsible for
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more...
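As an illustration of how a command ties a check to a program, here is a command and service pair modelled on the check_ping entries of the sample commands.cfg and localhost.cfg files. Treat the thresholds as an example rather than the values used at CIST.

```cfg
# Command: wraps the check_ping plugin; thresholds arrive as $ARG1$ and $ARG2$
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

# Service: warning at 100 ms or 20% packet loss, critical at 500 ms or 60% loss
define service{
        use                     local-service
        host_name               localhost
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```

The same command can be reused by any number of services, each passing its own thresholds after the "!" separators.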
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. So just copy them into your favorite directory; that's it.
8.3 Nagios.cfg
Nagios.cfg is the master file that loads all the other files. By default, Nagios is "not open": you need to enable the reading of the other configuration files. Look for the lines marked <--------- UNCOMMENT and uncomment them to activate them.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file.
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg       <--------- UNCOMMENT
# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg      <--------- UNCOMMENT
# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg        <--------- UNCOMMENT
# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg        <--------- UNCOMMENT
# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg         <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg concerns the Nagios host itself, but you can copy and paste its definitions within the same localhost file to add your new hosts, typically UNIX/Linux machines. For Windows machines there is another file: windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).
# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
#
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
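Note (not part of the sample file): if a second person should receive notifications, one possible approach is to add a contact and list it in the admins group. The contact name jdoe and its address are placeholders. The contactgroup shown here would replace the one above, not sit alongside it.

```cfg
# Hypothetical second contact, inheriting from the generic-contact template.
define contact{
        contact_name    jdoe                    ; Placeholder short name
        use             generic-contact
        alias           Second Administrator
        email           jdoe@example.org        ; Placeholder address
        }

# The admins group with both members (comma separated, no spaces).
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin,jdoe
        }
```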
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server            ; The name of this host template
        use                     generic-host            ; This template inherits other values from the generic-host template
        check_period            24x7                    ; By default, Linux hosts are checked round the clock
        check_interval          5                       ; Actively check the host every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each Linux host 10 times (max)
        check_command           check-host-alive        ; Default command to check Linux hosts
        notification_period     workhours               ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval   120                     ; Resend notifications every 2 hours
        notification_options    d,u,r                   ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after it.

# Nagios host
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

# kohkong
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

# takeo
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1 ; use "," to list multiple IP addresses
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, as was done for kohkong
        }
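Note (not part of the sample file): a host may belong to several hostgroups. As an illustration only, the sketch below groups the two Linux servers defined above under a hypothetical linux-servers group, in addition to allhosts.

```cfg
# Hypothetical extra hostgroup for the Linux servers defined in this file.
define hostgroup{
        hostgroup_name  linux-servers   ; Assumed name, not used elsewhere in this document
        alias           Linux Servers
        members         kohkong,takeo
        }
```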
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# Local service definition template - This is NOT a real service, just a template!
define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
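Note (not part of the sample file): further templates can be derived the same way as local-service. The sketch below assumes you want some services checked more often; the template name frequent-service and its intervals are illustrative choices only.

```cfg
# Hypothetical template for services that should be checked more often.
define service{
        name                    frequent-service        ; Assumed template name
        use                     generic-service         ; Inherit everything else from generic-service
        normal_check_interval   2                       ; Check every 2 minutes under normal conditions
        retry_check_interval    1                       ; Re-check every minute until a hard state is reached
        register                0                       ; Template only, not a real service
        }
```

A service then selects it with "use frequent-service" in place of local-service or generic-service.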
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# kohkong

# Define a service to "ping" the local machine
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# takeo

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of the localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
In our case, the corresponding entries for kohkong are:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
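Putting the two examples together, a complete definition for an additional Linux server could look like the sketch below. The host name newserver and the address 192.168.2.6 are placeholders. The linux-server template and the CistSW001 parent are the ones already used in localhost.cfg.

```cfg
# Sketch of a new Linux host; adjust the placeholder values to your machine.
define host{
        use             linux-server            ; Template defined in localhost.cfg
        host_name       newserver               ; Placeholder name
        alias           New Linux Server
        address         192.168.2.6             ; Placeholder address
        parents         CistSW001               ; Switch the host is plugged into (used by the status map)
        hostgroups      allhosts                ; Alternative to editing the hostgroup members list
        icon_image      suse.png
        statusmap_image suse.png
        }
```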
Now that a definition has been added for the host that will be monitored, we can start defining the services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own example, the HTTP service for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
The same pattern applies to every other service, including a new one such as HTTP on port 8080 (see the HTTP_8080 service defined for takeo in section 8.4).
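The takeo service in section 8.4 refers to a check_http_8080 command, which has to exist in commands.cfg. Its definition is not shown in this document, so the following is only a sketch of what it could look like, built on the check_http plugin's -p (port) option.

```cfg
# commands.cfg (sketch): HTTP check on TCP port 8080.
# The command name matches the one used by the takeo HTTP_8080 service;
# the exact definition used at CIST is an assumption.
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Alternative without a new command: the stock check_http command passes
# $ARG1$ through, so a service can simply use
#       check_command   check_http!-p 8080
```

A dedicated command keeps service definitions short, while the $ARG1$ form avoids one command per port.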
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as localhost.cfg. The main difference is that each one is designed for a specific kind of device: printers, switches and Windows machines. For Windows and Linux servers, remember that you need an agent on the server to scan its services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as: memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes:
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
   nsclient++ /install
5. Install the NSClient++ systray with the following command:
   nsclient++ SysTray
   Beware of the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
In our case (CIST), we have made a change by adding a new firewall (Takeo). For the return route, we therefore also need to add the IP address of Takeo's LAN interface when it is not in the same network as Nagios and NSClient++. The allowed_hosts option is commented out by default ("remark" mode) and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command:
   nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notifications out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition inherits default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples covered here use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
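The -l argument of CPULOAD is a list of "minutes,warning,critical" triplets, so more than one averaging window can be checked in a single service. The sketch below adds a 15-minute window; the thresholds 70 and 80 are illustrative values, not CIST settings.

```cfg
# Sketch: check both the 5-minute and the 15-minute CPU average.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90,15,70,80   ; 5 min: warn 80% / crit 90%; 15 min: warn 70% / crit 80%
        }
```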
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
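Other drives are monitored by changing the drive letter passed with -l. The sketch below assumes the server has a D:\ data drive, which is an assumption made for illustration.

```cfg
# Sketch: same check for a second drive (D:\ is assumed to exist).
define service{
        use                     generic-service
        host_name               winserver
        service_description     D:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l d -w 80 -c 90
        }
```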
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
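SERVICESTATE also accepts a comma-separated list of service names, so several Windows services can share one Nagios service. In the sketch below, Spooler (the Windows print spooler) is only an example of a second service name.

```cfg
# Sketch: watch two Windows services with a single check.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

Grouping services this way gives one alert for the whole set. Separate Nagios services make it clearer which Windows service stopped.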
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to get a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case, we have also replaced the default Nagios theme with another one.
10.2 Add/Changing Icons
Nagios icons exist in three formats: GIF, GD2 and PNG. The best approach is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
In this case we use suse.png.
10.2.1 Icon_image
is used in the normal menu of Nagios.
10.2.2 Vrml_image
is used in the 3D map environment, which we do not use because of our special Nagios theme. If you do use the 3D map, Windows Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
is used in the 2D Status Map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GND format in 40x40 px. You can also convert any icon you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy them with your favorite SSH explorer to /usr/local/nagios/share/images
11 CIST Monitored Hosts
The map below shows all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.
[Figure: CIST network map. Labels recovered from the diagram are listed below.]
Networks: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Zones: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs): ADSL Gateway, Fixed Public IP Address
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
1 Nagios CIST Phnom Penh project
1.1 Introduction
- Nagios is a monitoring tool that runs on Linux (GNU license). It lets you monitor all the network objects in an IT environment and draw a map of them.
1.2 Requirements
- Nagios will be installed on a PTC computer (Intel platform, P4 2.8 GHz, 2 GB RAM).
- This machine is located in the CIST IT servers room.
- The installation will be done on a RAID 1 (mirrored) architecture for better reliability.
Note: Nagios is a monitoring server and has no significant impact on production. Scanning the network objects uses only a little bandwidth, so network performance will not be affected by Nagios. Likewise, the hosts that run a client plugin will not be noticeably affected by the plugin.
2 Installation Linux Suse 10.2
2.1 Setup steps
Installation with 2 hard drives in RAID 1 mode.
2.1.1 Starting with Hardware Configuration
We assume that the PC/server has 2 hard drives (SATA). Check in your BIOS that both drives are recognized (Award BIOS: press "Del" to enter "Setup" -> "Standard CMOS Setup"). Make sure that "First Boot Device" is CD/DVD, so that the machine boots from the Linux Suse 10.2 DVD.
2.1.2 Starting with Linux Suse 10.2 Installation
RAID installation (partitioning)
Insert the DVD "Suse Linux 10.2 ISO" and power on the PC/server.
First Menu OpenSUSE -> Installation
Language -> Yes -> English US (default)
Installation mode -> New Installation
Clock / Time Zone -> Asia - Phnom Penh - Local Time (hardware clock)
IT Department Page 6
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
Desktop Selection -gtGnome Installation Settings -gtPartionning-gtCreate Custom Partition Setup-gtCustom Partitionning (for Experts) (new documention InstallRAID1odt) On 80 GB hard drive Swap 2G0 boot about 1GB about 53GB home about 20GB Software Installation(Softawre) Just after the RAID select ldquoSoftwarerdquo to add additionnals for Nagios Check ldquodetailsrdquo on bottom-gtin Filter menu-gtSearch-gtapache2-gtselectrdquo apache 2rdquo (28mb) -gtSearch-gtgcc-gtselectrdquogcc the system GNU C Compilerrdquo-gtaccept Agree-gtInstall -gtAutomatic Reboot
Password for System Administrator -gt yoursquore invited to enter ldquoroot passwordrdquo Hostname and Domain Name -gtPursat and cistlan
-gtuncheck ldquoChange Hostname via Dhcprdquo Network Configuration -gtenable SSH in Firewall menu-gtssh open Network Interface -gtclisk on Network Interface-gtedit-gt StaticIp Adress-gt102030442552552550
-gtHostname and Name Server -gtName Server1 and 2=1020304310203040 -gtRouting-gt10203020
-gtNext to validate your configuration
Test Internet Connection -gtYes normally a response appears withrdquoSuccesrdquo Novell Customer Center Configuration -gtConfigure Later Additional installation Sources- -gtNo User Authentication Mrthod -gtcheck Local(etcpasswd New Local User -gtfull namenagios administrator Username nagios
Nagiosadmin and password Hardware Configuration -gtcheck your configuration -gtnext Installation Completed -gtFinish
3 Nagios Installation
This has been taken from www.nagios.org: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html
3.1 Introduction
This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on openSUSE and have it monitoring your local machine inside of 20 minutes. No advanced installation options are discussed here - just the basics that will work for 95% of users who want to get started. These instructions were written based on an openSUSE 10.2 installation.
3.2 Required Packages
Make sure you've installed the following packages on your openSUSE installation before continuing. You can use yast to install packages under openSUSE.
- apache2
- C/C++ development libraries
Check that you have internet access: open a terminal session and run
    nslookup www.google.com
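The package check above can also be done from the terminal. The following is a minimal sketch, not part of the original guide: the function name missing_tools and the list of tools (gcc, make, wget, nslookup) are assumptions chosen to match the commands used later in this document. The apache2 package itself can be checked with YaST or with rpm -q apache2.

```shell
#!/bin/sh
# Print the name of every command from the argument list that is not on PATH.
# (missing_tools is a hypothetical helper, not a Nagios or SUSE command.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Assumed tool list: compiler, make, download tool and DNS lookup.
missing=$(missing_tools gcc make wget nslookup)
if [ -n "$missing" ]; then
    echo "Missing tools:"
    echo "$missing"
else
    echo "All required tools found"
fi
```

If a name is printed, install the corresponding package with YaST before continuing.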
3.3 Create Account Information
Become the root user.
    su -l
Create a new nagios user account and give it a password.
    /usr/sbin/useradd nagios
    passwd nagios
Create a new nagios group. Add the nagios user to the group.
    /usr/sbin/groupadd nagios
    /usr/sbin/usermod -G nagios nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.
    /usr/sbin/groupadd nagcmd
    /usr/sbin/usermod -G nagcmd nagios
    /usr/sbin/usermod -G nagcmd wwwrun
3.4 Download Nagios and the Plugins
Create a directory for storing the downloads.
    mkdir ~/downloads
    cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively.
    wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
    wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
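Before compiling, it can be worth confirming that both downloads are complete. The sketch below is an optional addition, not a step from the quickstart guide; check_tarballs is a hypothetical helper name. It only relies on tar being able to list the archive contents.

```shell
#!/bin/sh
# Succeed only if every given file is a readable gzip-compressed tar archive.
# Stops at the first broken archive.
check_tarballs() {
    for archive in "$@"; do
        if tar tzf "$archive" >/dev/null 2>&1; then
            echo "OK      $archive"
        else
            echo "BROKEN  $archive"
            return 1
        fi
    done
}

# Usage with the files downloaded above:
# check_tarballs ~/downloads/nagios-3.0a4.tar.gz ~/downloads/nagios-plugins-1.4.7.tar.gz
```

A BROKEN line usually means the download was interrupted and the file should be fetched again.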
3.5 Compile and Install Nagios
Extract the Nagios source code tarball.
    cd ~/downloads
    tar xzf nagios-3.0a4.tar.gz
    cd nagios-3.0a4
Run the Nagios configure script, passing the name of the group you created earlier like so:
    ./configure --with-command-group=nagcmd
Compile the Nagios source code.
    make all
Install binaries, init script, sample config files and set permissions on the external command directory.
    make install
    make install-init
    make install-config
    make install-commandmode
Don't start Nagios yet - there's still more that needs to be done.
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web interface
Install the Nagios web config file in the Apache conf.d directory.
    make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later.
    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect, then check its status.
    service apache2 restart    (or: rcapache2 start)
    service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball.
    cd ~/downloads
    tar xzf nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7
Compile and install the plugins.
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make
    make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them automatically start when the system boots.
    chkconfig --add nagios
    chkconfig nagios on
    chkconfig --add apache2
    chkconfig apache2 on
Verify the sample Nagios configuration files.
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are problems:
    mkdir /home/nagios
    chown nagios /home/nagios
If there are no errors, start Nagios.
    service nagios start
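The verify-then-start sequence is worth repeating after every configuration change. The sketch below wraps it in a small function; config_is_valid is a hypothetical helper name, and the only Nagios feature it relies on is the -v option shown above. The restart line is left commented out because it needs root and a working installation.

```shell
#!/bin/sh
# Run the Nagios pre-flight check quietly.
# Arguments: path to the nagios binary, path to the main config file.
# Returns the exit status of "nagios -v", so 0 means the configuration is valid.
config_is_valid() {
    "$1" -v "$2" >/dev/null 2>&1
}

# Usage on the Nagios server (as root):
# if config_is_valid /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg; then
#     service nagios restart
# else
#     echo "Configuration errors found, Nagios left untouched"
# fi
```

When the check fails, run the nagios -v command by hand to see the full error messages.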
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.
    http://localhost/nagios/ or http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this by:
- Opening the control center
- Selecting "Open Administrator Settings" to open the YaST administrator control center
- Selecting "Firewall" from the "Security and Users" category
- Clicking the "Allowed Services" option in the Firewall Configuration window
- Adding "HTTP Server" to the allowed services list for the "External Zone"
- Clicking "Next" and "Accept" to activate the new firewall settings
4 Nagios Documentation
The main Nagios site is www.nagios.org
Online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main Nagios web interface.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven by configuration files (.cfg files). To activate a new configuration, modify those files with your favorite editor (vi, gedit...). You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, all of which are built around host and object definitions.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios online documentation.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.).
- Contacts receive notifications for the hosts and services they are responsible for.
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more...
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc/
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to a directory of your choice, that's it.
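The copy described above can be scripted so that each backup lands in its own dated folder. This is a minimal sketch under stated assumptions: backup_nagios_cfg is a hypothetical helper name, the folder naming scheme is an arbitrary choice, and htpasswd.users is the password file created in section 3.7.

```shell
#!/bin/sh
# Copy the Nagios configuration files from SRC into a new dated folder under DEST
# and print the path of that folder.
backup_nagios_cfg() {
    src=$1
    dest=$2
    target="$dest/nagios-etc-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$target" || return 1
    # *.cfg covers nagios.cfg, localhost.cfg, commands.cfg and the other object files.
    cp -p "$src"/*.cfg "$target"/ || return 1
    # The web password file has no .cfg suffix, so copy it separately if present.
    if [ -f "$src/htpasswd.users" ]; then
        cp -p "$src/htpasswd.users" "$target"/ || return 1
    fi
    echo "$target"
}

# Usage (as root), with a destination folder of your choice:
# backup_nagios_cfg /usr/local/nagios/etc /root/nagios-backups
```

Keeping one folder per backup makes it easy to return to the last configuration that passed the nagios -v check.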
8.3 Nagios.cfg
Nagios.cfg is the master file that loads all the other configuration files. By default, the references to some of those files are commented out ("not open"), so you need to enable the reading of the other configuration files. Look for the lines marked <--------- UNCOMMENT in the listing below and remove the leading # from them in your own file to activate them. The marker itself is only an annotation in this document and must not be typed into nagios.cfg.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file.  I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes.  This should be the first option specified
# in the config file!

log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg     <--------- UNCOMMENT

# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg    <--------- UNCOMMENT

# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg       <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:

#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
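To see at a glance which object files Nagios will actually read, the active cfg_file and cfg_dir lines can be filtered out of the main configuration file. The sketch below is an illustration only; active_cfg_entries is a hypothetical helper name, and it simply treats every line that does not start with cfg_file= or cfg_dir= (for example a line starting with #) as inactive.

```shell
#!/bin/sh
# List the cfg_file / cfg_dir entries that are active (not commented out)
# in a Nagios main configuration file, with surrounding whitespace removed.
active_cfg_entries() {
    grep -E '^[[:space:]]*cfg_(file|dir)=' "$1" |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Usage on the Nagios server:
# active_cfg_entries /usr/local/nagios/etc/nagios.cfg
```

If windows.cfg, printer.cfg or switch.cfg is missing from the output, its line in nagios.cfg is still commented out.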
8.4 Localhost.cfg
By default, localhost.cfg describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an extremely simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc.  The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           Normal Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }

# COMMANDS
# NOTE: Sample command definitions can now be found in the sample commands.cfg file
# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server            ; The name of this host template
        use                     generic-host            ; This template inherits other values from the generic-host template
        check_period            24x7                    ; By default, Linux hosts are checked round the clock
        check_interval          5                       ; Actively check the host every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each Linux host 10 times (max)
        check_command           check-host-alive        ; Default command to check Linux hosts
        notification_period     workhours               ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval   120                     ; Resend notifications every 2 hours
        notification_options    d,u,r                   ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux or similar servers after this one.

# Nagios host
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

# kohkong
define host{
        use             linux-server    ; Name of host template to use
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

# takeo
define host{
        use             linux-server    ; Name of host template to use
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1  ; use "," to list several IP addresses
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, like kohkong
        }
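Adding a Linux server therefore takes two changes in localhost.cfg: a new define host block and one more name in the members line of allhosts. The sketch below prints such a host block so that it can be pasted into the file. It is an illustration, not part of the sample configuration: new_linux_host is a hypothetical helper, and the host name, alias and address in the usage line are invented examples.

```shell
#!/bin/sh
# Print a host definition that inherits from the linux-server template.
# Arguments: host name, alias, IP address, parent host.
new_linux_host() {
    cat <<EOF
define host{
        use             linux-server
        host_name       $1
        alias           $2
        address         $3
        parents         $4
        }
EOF
}

# Usage (hypothetical host "pursat2" behind the CistSW001 switch):
# new_linux_host pursat2 "Second Linux server" 192.168.2.9 CistSW001 >> /usr/local/nagios/etc/localhost.cfg
```

After pasting the block, append the new host name to the members line of the allhosts group and run the nagios -v check from section 3.9 before restarting Nagios.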
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service freshness
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        is_volatile                     0       ; The service is not volatile
        check_period                    24x7    ; The service can be checked at any time of the day
        max_check_attempts              3       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10      ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2       ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins  ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60      ; Re-notify about service problems every hour
        notification_period             24x7    ; Notifications can be sent out at any time
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4       ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1       ; Re-check the service every minute until a hard state can be determined
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# Services for kohkong

# Define a service to "ping" the local machine
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Services for takeo

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP (port 8080) on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step.
For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
So, for kohkong, it looks like this:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point - either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more.
The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?
Here is our example with kohkong for HTTP:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
We can do this for all services. The same approach also works for a new service such as HTTP on port 8080: see the HTTP_8080 service defined for takeo in localhost.cfg, which uses the check_http_8080 command.
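The takeo service HTTP_8080 in localhost.cfg calls a command named check_http_8080, which has to be defined in commands.cfg. The original command definition is not shown in this document, so the following is only a sketch of what it might look like. It assumes the standard -p (port) option of the check_http plugin; http_port_command is a hypothetical helper that prints the definition for any port.

```shell
#!/bin/sh
# Print a command definition that runs check_http against a given TCP port.
# The command name check_http_<port> follows the naming used for takeo.
http_port_command() {
    port=$1
    cat <<EOF
define command{
        command_name    check_http_${port}
        command_line    \$USER1\$/check_http -I \$HOSTADDRESS\$ -p ${port}
        }
EOF
}

# Usage: print the definition, review it, then paste it into commands.cfg.
# http_port_command 8080
```

A service then refers to it with check_command check_http_8080, exactly as in the takeo block of localhost.cfg.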
8.6 Other cfg files (switch.cfg, printer.cfg...)
All the other files work in the same way as the localhost.cfg file. The main difference is that they are specially designed for printers, switches and Windows machines. For Windows servers and Linux servers, remember that you need an agent on the server to scan its services:
- Windows machines -> Nsclient.exe
- Linux machines -> nagios-plugins-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
    nsclient++ /install
5. Install the NSClient++ systray with the following command:
    nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
In our case (CIST) we have made a change with a new firewall (Takeo). For the return route we therefore also need to add the IP address of Takeo's LAN interface, if that interface is not in the same network as Nagios and the NSClient++ host. The allowed_hosts option is commented out ("remark" mode) by default and must be uncommented to take effect.
8. Start the NSClient++ service with the following command:
    nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
    define host{
        name                    windows-server      ; The name of this host template
        use                     generic-host        ; Inherit default values from the generic-host template
        check_period            24x7                ; By default, Windows servers are monitored round the clock
        check_interval          5                   ; Actively check the server every 5 minutes
        retry_interval          1                   ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                  ; Check each server 10 times (max)
        check_command           check-host-alive    ; Default command to check if servers are "alive"
        notification_period     24x7                ; Send notifications out at any time, day or night
        notification_interval   30                  ; Resend notifications every 30 minutes
        notification_options    d,r                 ; Only send notifications for specific host states
        contact_groups          admins              ; Notifications get sent to the admins by default
        register                0                   ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
    define host{
        use             windows-server      ; Inherit default values from a template
        host_name       winserver           ; The name we're giving to this host
        alias           My Windows Server   ; A longer name associated with the host
        address         192.168.1.2         ; IP address of the host
        hostgroups      allhosts            ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:
    define hostgroup{
        hostgroup_name  windows-servers     ; The name of the hostgroup
        alias           Windows Servers     ; Long name of the group
        members         winserver           ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
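As a sketch of how a further machine might be added later, assuming a second Windows server called winserver2 at 192.168.1.3 (both the name and the address are placeholders, not CIST hosts), the new host reuses the template and is appended to the members list. The hostgroup shown here would replace the definition above rather than be added next to it, since a hostgroup name may only be defined once:

```cfg
# Hypothetical second Windows machine; name and address are placeholders.
define host{
    use             windows-server              ; Same template as winserver
    host_name       winserver2
    alias           My Second Windows Server
    address         192.168.1.3
    hostgroups      allhosts
    }

# Updated hostgroup: members is a comma-separated list without spaces.
define hostgroup{
    hostgroup_name  windows-servers
    alias           Windows Servers
    members         winserver,winserver2
    }
```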
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
    define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:
    define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:
    define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater:
    define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
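The -l argument is a list of "minutes,warning,critical" triplets, and check_nt accepts several triplets in one request. As a hedged sketch (the 15-minute window and its thresholds are illustrative values, not ones used at CIST), a single service could watch both a short and a longer averaging window:

```cfg
# Hypothetical variant: 5-minute load (warn 80%, crit 90%)
# and 15-minute load (warn 70%, crit 80%) in one check.
define service{
    use                     generic-service
    host_name               winserver
    service_description     CPU Load 5 and 15 min
    check_command           check_nt!CPULOAD!-l 5,80,90,15,70,80
    }
```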
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater:
    define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater:
    define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
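Other drives can be covered by copying the definition and changing the drive letter given to -l. A minimal sketch, assuming the server also has a D: drive (an assumption, as are the tighter thresholds chosen here):

```cfg
# Hypothetical check of a second drive with different thresholds.
define service{
    use                     generic-service
    host_name               winserver
    service_description     D Drive Space
    check_command           check_nt!USEDDISKSPACE!-l d -w 70 -c 85
    }
```

Each service on a host needs its own service_description, which is why the drive letter is part of the name.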
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped:
    define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
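The -l argument of SERVICESTATE takes a comma-separated list, so several Windows services can be grouped into one Nagios service. A sketch under the assumption that the print spooler (Windows service name Spooler) should be watched together with W3SVC; the grouping itself is illustrative:

```cfg
# Hypothetical grouped check: CRITICAL if any listed Windows service is stopped.
define service{
    use                     generic-service
    host_name               winserver
    service_description     Web and Print Services
    check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
    }
```

Grouping keeps the service list short, at the cost of a single alert for the whole set; separate definitions give one alert per Windows service.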
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running:
    define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to get a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
To get a smooth map we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons exist in three formats: GIF, GD2 and PNG. The best thing to do is to use only PNG files, because you then have the same icon for all the different modules of Nagios. Sample with kohkong:
    host_name           kohkong
    alias               localhost
    address             192.168.2.5
    parents             CistSW001
    icon_image          suse.png
    vrml_image          suse.png
    statusmap_image     suse.png
In this case we use suse.png.
10.2.1 Icon image
icon_image is used in the normal Nagios menus.
10.2.2 Vrml_image
vrml_image is used in the 3D map environment, which we do not use because of our special Nagios theme. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GND format at 40x40 pixels. You can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy them with your favorite SSH file explorer into /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.
[Figure: CIST network map. The labels recoverable from the figure are listed below.]
Networks: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Zones: Common Servers; Students PCs (~70 PCs) + VMware & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary and Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, fixed public IP address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
11 CIST MONITORED HOSTS 40
1 Nagios CIST Phnom Penh project
1.1 Introduction
- Nagios is a monitoring tool for Linux (GNU license). It monitors all the network objects of an IT environment and builds a map of them.
1.2 Requirements
- Nagios will be installed on a PTC computer (Intel platform, P4 2.8 GHz, 2 GB RAM).
- This machine is located in the CIST IT servers room.
- The installation will be done on a RAID 1 architecture to be more efficient.
Note: Nagios is a monitoring server and has no further impact on production. Scanning the network objects uses some bandwidth, so the network will not be improved by Nagios. The hosts that run a plugin client are not further affected by the plugin.
2 Installation of Linux SUSE 10.2
2.1 Setup steps
Installation with 2 hard drives in RAID 1 mode.
2.1.1 Starting with Hardware Configuration
We assume that the PC/server has 2 hard drives (SATA). Check in your BIOS that both drives are recognized (Award BIOS: press "Del" to enter "Setup" -> "Standard CMOS Setup"). Make sure that "First Boot Device" is CD/DVD so that the machine boots from the Linux SUSE 10.2 DVD.
2.1.2 Starting with Linux SUSE 10.2 Installation
RAID installation (partitioning)
Insert the "Suse Linux 10.2 ISO" DVD and power on the PC/server.
First menu (openSUSE) -> Installation
Language -> Yes -> English US (default)
Installation mode -> New Installation
Clock / Time Zone -> Asia - Phnom Penh - Local Time (hardware clock)
Desktop Selection -> Gnome
Installation Settings -> Partitioning -> Create Custom Partition Setup -> Custom Partitioning (for Experts) (see the new document InstallRAID1.odt)
On an 80 GB hard drive: swap 2 GB, /boot about 1 GB, / about 53 GB, /home about 20 GB
Software installation (Software): just after the RAID setup, select "Software" to add the additional packages for Nagios. Check "Details" at the bottom -> in the Filter menu -> Search -> apache2 -> select "apache2" (28mb) -> Search -> gcc -> select "gcc, the system GNU C Compiler" -> Accept
Agree -> Install -> automatic reboot
Password for System Administrator -> you are invited to enter the root password
Hostname and Domain Name -> Pursat and cist.lan -> uncheck "Change Hostname via DHCP"
Network Configuration -> enable SSH in the Firewall menu -> SSH open
Network Interface -> click on Network Interface -> Edit -> Static IP Address -> 10.20.30.44 / 255.255.255.0
    -> Hostname and Name Server -> Name Server 1 and 2 = 10.20.30.43, 10.20.30.40
    -> Routing -> 10.20.30.20
    -> Next to validate your configuration
Test Internet Connection -> Yes; normally a response appears with "Success"
Novell Customer Center Configuration -> Configure Later
Additional Installation Sources -> No
User Authentication Method -> check Local (/etc/passwd)
New Local User -> full name: nagios administrator; username: nagios (Nagiosadmin) and password
Hardware Configuration -> check your configuration -> Next
Installation Completed -> Finish
3 Nagios Installation
This has been taken from www.nagios.org: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html
3.1 Introduction
This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on openSUSE and have it monitoring your local machine inside of 20 minutes. No advanced installation options are discussed here, just the basics that will work for 95% of users who want to get started. These instructions were written based on an openSUSE 10.2 installation.
3.2 Required Packages
Make sure you've installed the following packages on your openSUSE installation before continuing. You can use yast to install packages under openSUSE.
- apache2
- C/C++ development libraries
Check that you have Internet access: open a terminal session and run
    nslookup www.google.com
3.3 Create Account Information
Become the root user.
    su -l
Create a new nagios user account and give it a password.
    /usr/sbin/useradd nagios
    passwd nagios
Create a new nagios group. Add the nagios user to the group.
    /usr/sbin/groupadd nagios
    /usr/sbin/usermod -G nagios nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.
    /usr/sbin/groupadd nagcmd
    /usr/sbin/usermod -G nagcmd nagios
    /usr/sbin/usermod -G nagcmd wwwrun
3.4 Download Nagios and the Plugins
Create a directory for storing the downloads.
    mkdir ~/downloads
    cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively.
    wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
    wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
3.5 Compile and Install Nagios
Extract the Nagios source code tarball.
    cd ~/downloads
    tar xzf nagios-3.0a4.tar.gz
    cd nagios-3.0a4
Run the Nagios configure script, passing the name of the group you created earlier, like so:
    ./configure --with-command-group=nagcmd
Compile the Nagios source code.
    make all
Install binaries, init script, sample config files and set permissions on the external command directory.
    make install
    make install-init
    make install-config
    make install-commandmode
Don't start Nagios yet - there's still more that needs to be done.
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web Interface
Install the Nagios web config file in the Apache conf.d directory.
    make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later.
    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect.
    service apache2 restart
    (or) rcapache2 start
    service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball.
    cd ~/downloads
    tar xzf nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7
Compile and install the plugins.
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make
    make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them start automatically when the system boots.
    chkconfig --add nagios
    chkconfig nagios on
    chkconfig --add apache2
    chkconfig apache2 on
Verify the sample Nagios configuration files.
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are problems:
    mkdir /home/nagios
    chown nagios /home/nagios
If there are no errors, start Nagios.
    service nagios start
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and the password you specified earlier.
    http://localhost/nagios/ or http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this by:
- Opening the control center
- Selecting "Open Administrator Settings" to open the YaST administrator control center
- Selecting "Firewall" from the "Security and Users" category
- Clicking the "Allowed Services" option in the Firewall Configuration window
- Adding "HTTP Server" to the allowed services list for the "External Zone"
- Clicking "Next" and "Accept" to activate the new firewall settings
4 Nagios Documentation
The main Nagios site is www.nagios.org
Online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main Nagios web interface.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven by configuration files (.cfg files). These files have to be modified with your favorite editor (vi, gedit, …) to activate a new configuration. You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on host/object configuration.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
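As a short sketch of the two directives, here is an excerpt of a main configuration file using the installation paths from this guide. The "servers" directory is an assumption: it only makes sense once such a directory exists and holds .cfg files.

```cfg
# Excerpt of a main configuration file (nagios.cfg).

# cfg_file: read object definitions from one specific file.
cfg_file=/usr/local/nagios/etc/commands.cfg
cfg_file=/usr/local/nagios/etc/localhost.cfg

# cfg_dir: read every file with a .cfg extension found in a directory.
# Hypothetical directory; create it before enabling this line.
cfg_dir=/usr/local/nagios/etc/servers
```

With cfg_dir, adding a host becomes a matter of dropping a new file into the directory, with no further change to nagios.cfg.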
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation on object definitions.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
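A parent/child relationship is expressed with the parents directive of the child host. The sketch below uses kohkong and the switch CistSW001 from the CIST network; the switch address is a placeholder, and the generic-switch template is assumed to come from the sample switch.cfg file:

```cfg
# The switch that kohkong is plugged into (address is a placeholder).
define host{
    use         generic-switch      ; Assumed template from the sample switch.cfg
    host_name   CistSW001
    alias       CIST switch 001
    address     192.168.2.2
    }

# kohkong is a child of CistSW001: if the switch is down,
# Nagios reports kohkong as UNREACHABLE rather than DOWN.
define host{
    use         linux-server
    host_name   kohkong
    alias       kohkong
    address     192.168.2.5
    parents     CistSW001
    }
```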
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for the hosts and services they are responsible for
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
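To make the link between commands and services concrete, here is a minimal sketch of a command definition and a service that calls it. The command name check_tcp_port and the SSH example are hypothetical (the sample commands.cfg ships its own check_tcp command); $USER1$ is the plugin directory and $ARG1$ is whatever follows the "!" in check_command:

```cfg
# Hypothetical command: test that a TCP port answers on the host.
define command{
    command_name    check_tcp_port
    command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
    }

# Service using the command; "22" is substituted for $ARG1$.
define service{
    use                     generic-service
    host_name               kohkong
    service_description     SSH port
    check_command           check_tcp_port!22
    }
```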
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to your favorite directory, that's it.
8.3 nagios.cfg
nagios.cfg is the master file that loads all the other files. By default Nagios is "not open", so you need to enable the reading of the other configuration files. Look for the lines marked <--------- UNCOMMENT and uncomment them to activate them.
    # NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
    #
    # Read the documentation for more information on this configuration
    # file.  I've provided some comments here, but things may not be so
    # clear without further explanation.
    #
    # Last Modified: 05-08-2007

    # LOG FILE
    # This is the main log file where service and host events are logged
    # for historical purposes.  This should be the first option specified
    # in the config file!!!

    log_file=/usr/local/nagios/var/nagios.log

    # OBJECT CONFIGURATION FILE(S)
    # These are the object configuration files in which you define hosts,
    # host groups, contacts, contact groups, services, etc.
    # You can split your object definitions across several config files
    # if you wish (as shown below), or keep them all in a single config file.

    # Command definitions
    cfg_file=/usr/local/nagios/etc/commands.cfg       <--------- UNCOMMENT

    # Host and service definitions, etc. for monitoring this machine
    cfg_file=/usr/local/nagios/etc/localhost.cfg      <--------- UNCOMMENT

    # Sample definitions for monitoring a Windows machine
    cfg_file=/usr/local/nagios/etc/windows.cfg        <--------- UNCOMMENT

    # Sample definitions for monitoring a network printer
    cfg_file=/usr/local/nagios/etc/printer.cfg        <--------- UNCOMMENT

    # Sample definitions for monitoring a switch/router
    cfg_file=/usr/local/nagios/etc/switch.cfg         <--------- UNCOMMENT

    # You can also tell Nagios to process all config files (with a .cfg
    # extension) in a particular directory by using the cfg_dir
    # directive as shown below:

    #cfg_dir=/usr/local/nagios/etc/servers
    #cfg_dir=/usr/local/nagios/etc/printers
    #cfg_dir=/usr/local/nagios/etc/switches
    #cfg_dir=/usr/local/nagios/etc/routers
8.4 localhost.cfg
By default, localhost.cfg describes the Nagios host itself, but you can copy and paste its host configuration inside the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
    # LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
    #
    # Last Modified: 02-27-2007
    #
    # NOTE: This config file is intended to serve as an *extremely* simple
    #       example of how you can create your object configuration file(s).

    # TIME PERIODS

    # This defines a timeperiod where all times are valid for checks,
    # notifications, etc.  The classic "24x7" support nightmare. :-)
    define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

    # 'workhours' timeperiod definition
    define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

    # 'none' timeperiod definition
    define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
    # COMMANDS
    # NOTE: Sample command definitions can now be found in the sample commands.cfg file

    # CONTACTS

    # Generic contact definition template - This is NOT a real contact, just a template!
    define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
        }

    # Just one contact defined by default - the Nagios admin (that's you)
    define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
    # CONTACT GROUPS

    # We only have one contact in this simple configuration file, so there is
    # no need to create more than one contact group.
    define contactgroup{
        contactgroup_name   admins
        alias               Nagios Administrators
        members             nagiosadmin
        }

    # HOSTS

    # Generic host definition template - This is NOT a real host, just a template!
    define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
    # Linux host definition template - This is NOT a real host, just a template!
    define host{
        name                    linux-server        ; The name of this host template
        use                     generic-host        ; This template inherits other values from the generic-host template
        check_period            24x7                ; By default, Linux hosts are checked round the clock
        check_interval          5                   ; Actively check the host every 5 minutes
        retry_interval          1                   ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                  ; Check each Linux host 10 times (max)
        check_command           check-host-alive    ; Default command to check Linux hosts
        notification_period     workhours           ; Linux admins hate to be woken up, so we only notify during the day
                                                    ; Note that the notification_period variable is being overridden from
                                                    ; the value that is inherited from the generic-host template!
        notification_interval   120                 ; Resend notifications every 2 hours
        notification_options    d,u,r               ; Only send notifications for specific host states
        contact_groups          admins              ; Notifications get sent to the admins by default
        register                0                   ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
    # Since this is a simple configuration file, we only monitor one host - the
    # local host (this machine).
    # Add your new Linux or similar servers below.

    # Nagios host
    define host{
        use                 linux-server        ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name           Nagios
        alias               localhost
        address             127.0.0.1
        icon_image          ultrapenguin.png
        vrml_image          ultrapenguin.png
        statusmap_image     ultrapenguin.png
        }

    # kohkong
    define host{
        use                 linux-server        ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name           kohkong
        alias               localhost
        address             192.168.2.5
        parents             CistSW001
        icon_image          suse.png
        vrml_image          suse.png
        statusmap_image     suse.png
        }
    # takeo: use "," to give multiple IP addresses
    define host{
        use                 linux-server        ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name           takeo
        alias               localhost
        address             192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
        parents             CistSW001
        icon_image          suse.png
        vrml_image          suse.png
        statusmap_image     suse.png
        }
    # HOST GROUPS

    # We only have one host in our simple config file, so there is no need to
    # create more than one hostgroup.
    define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new host to the allhosts group here, like kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!

define service{
        name                            generic-service         ; The name of this service template
        active_checks_enabled           1                       ; Active service checks are enabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted
        parallelize_check               1                       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                       ; We should obsess over this service (if necessary)
        check_freshness                 0                       ; Default is to NOT check service freshness
        notifications_enabled           1                       ; Service notifications are enabled
        event_handler_enabled           1                       ; Service event handler is enabled
        flap_detection_enabled          1                       ; Flap detection is enabled
        failure_prediction_enabled      1                       ; Failure prediction is enabled
        process_perf_data               1                       ; Process performance data
        retain_status_information       1                       ; Retain status information across program restarts
        retain_nonstatus_information    1                       ; Retain non-status information across program restarts
        is_volatile                     0                       ; The service is not volatile
        check_period                    24x7                    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10                      ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2                       ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins                  ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r                 ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60                      ; Re-notify about service problems every hour
        notification_period             24x7                    ; Notifications can be sent out at any time
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# Local service definition template - This is NOT a real service, just a template!

define service{
        name                            local-service           ; The name of this service template
        use                             generic-service         ; Inherit default values from the generic-service definition
        max_check_attempts              4                       ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to ping the local machine

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }

# End COPY/PASTE for SERVICES
# kohkong

# Define a service to ping the machine kohkong

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }
# takeo

# Define a service to ping the machine takeo

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             HTTP_8080
        check_command                   check_http_8080
        notifications_enabled           0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }

In our configuration, the kohkong host is defined like this:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point - either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }

A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }

This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our example with kohkong for HTTP:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

We can do the same for all services. If we would like to create a new service, such as HTTP on port 8080 (HTTP_8080), we follow the same pattern.
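For a service such as HTTP_8080, the check_command it names must also exist as a command definition. The sketch below writes one possible pair of definitions to a temporary file for review. The names check_http_8080, takeo and local-service come from the service list earlier in this document; the command_line is an assumption that relies on the -p (port) option of check_http, so compare it with your own commands.cfg before using it.

```shell
#!/bin/sh
# Sketch: generate a command and a service definition for HTTP on port 8080.
# Assumption: command_line uses check_http with its -p (port) option.
out=$(mktemp)
cat > "$out" <<'EOF'
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080
        }

define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
EOF
cat "$out"    # review, then copy the blocks into commands.cfg and localhost.cfg
```

The heredoc is quoted ('EOF') so that the shell leaves the Nagios macros $USER1$ and $HOSTADDRESS$ untouched.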
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that each one is specially designed for one kind of object: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> nagios-plugins-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as: memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
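Uncommenting the windows.cfg reference means removing the leading "#" from its cfg_file line in nagios.cfg. The sketch below does this with sed on a temporary copy; the cfg_file paths in the sample are assumptions, so check the actual line in your own nagios.cfg.

```shell
#!/bin/sh
# Sketch: enable the commented-out windows.cfg reference in a copy of nagios.cfg.
# Assumption: the paths below are examples only; your nagios.cfg may differ.
work=$(mktemp)
cat > "$work" <<'EOF'
cfg_file=/usr/local/nagios/etc/localhost.cfg
#cfg_file=/usr/local/nagios/etc/windows.cfg
EOF

# Strip the leading "#" only from the line that references windows.cfg.
sed 's|^#\(cfg_file=.*windows\.cfg\)$|\1|' "$work" > "$work.new"
cat "$work.new"
```

Working on a copy and comparing the result before replacing the real file keeps a typo in the expression from damaging the live configuration.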
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
        nsclient++ /install
5. Install the NSClient++ systray with the following command:
        nsclient++ SysTray
   Beware of the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
In our case (CIST) we have made a change with a new firewall (Takeo). For the return route we therefore also need to add the IP address of the LAN interface of Takeo, if it is not in the same network as Nagios and the NSClient. The allowed_hosts option is commented out by default ("remark" mode) and has to be activated.
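Activating allowed_hosts comes down to removing the leading ";" and listing the permitted addresses. The following sketch shows the change on a temporary sample file. The addresses 192.0.2.10 (Nagios server) and 192.0.2.1 (firewall LAN interface) are placeholders, and the [Settings] section name and default value are assumed from NSClient++ sample files, so verify them against your own NSC.INI.

```shell
#!/bin/sh
# Sketch: activate allowed_hosts in a copy of NSC.INI.
# Assumptions: placeholder addresses; [Settings] section as in NSClient++ samples.
ini=$(mktemp)
cat > "$ini" <<'EOF'
[Settings]
;allowed_hosts=127.0.0.1/32
EOF

ALLOWED="192.0.2.10,192.0.2.1"
# NSC.INI marks comments with ";": drop it and set the address list.
sed "s|^;allowed_hosts=.*|allowed_hosts=$ALLOWED|" "$ini" > "$ini.new"
cat "$ini.new"
```

On the Windows machine itself the same edit is usually made by hand in a text editor; restart the NSClient++ service afterwards so the new list is read.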
8. Start the NSClient++ service with the following command:
        nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
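Before adding the Windows server to the Nagios configuration, it can be useful to query the agent once by hand from the Nagios server. This is a minimal sketch: the plugin directory /usr/local/nagios/libexec is assumed to be the default of a source install, port 12489 is the one used by the check_nt command definition in this document, and probe_agent is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: ask an NSClient++ agent for its version from the Nagios server.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/nagios/libexec}"   # assumed default plugin directory
AGENT_PORT=12489                                          # port used by the check_nt command

probe_agent() {
    host="$1"
    if [ ! -x "$PLUGIN_DIR/check_nt" ]; then
        echo "check_nt not found in $PLUGIN_DIR" >&2
        return 3    # 3 is the UNKNOWN state in the Nagios plugin convention
    fi
    "$PLUGIN_DIR/check_nt" -H "$host" -p "$AGENT_PORT" -v CLIENTVERSION
}

# Usage (replace the address with the Windows machine's IP):
#   probe_agent 192.0.2.20
```

If the agent refuses the connection, the allowed_hosts line in NSC.INI and any firewall between the two machines are the first things to check.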
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notifications out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:

define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }

Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }

The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }

Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
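In a service definition, the check_command value is split on "!": the first field names the command, and the remaining fields fill $ARG1$, $ARG2$ and so on in the command_line. The sketch below imitates that substitution for the check_nt command so the resulting plugin call can be read off directly. The function expand_check_nt is a hypothetical illustration, not a Nagios tool, and it handles only two arguments.

```shell
#!/bin/sh
# Sketch: show how a check_command value fills $ARG1$ and $ARG2$ of check_nt.

expand_check_nt() {
    check_command="$1"      # e.g. 'check_nt!CPULOAD!-l 5,80,90'
    host_address="$2"       # value that would replace $HOSTADDRESS$
    old_ifs=$IFS
    IFS='!'
    set -f                  # no filename expansion while splitting
    set -- $check_command   # $1 = command name, $2 = ARG1, $3 = ARG2
    set +f
    IFS=$old_ifs
    # $USER1$ is left as a literal macro; Nagios replaces it with the plugin path.
    echo "\$USER1\$/check_nt -H $host_address -p 12489 -v $2${3:+ $3}"
}

expand_check_nt 'check_nt!CPULOAD!-l 5,80,90' 192.0.2.20
expand_check_nt 'check_nt!UPTIME' 192.0.2.20
```

Reading the expanded line makes it easier to test a new service by hand, because the same arguments can be passed to the plugin on the command line.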
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }

9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }

9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }

9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }

9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }

9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
To have a smooth map we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Add / Changing Icons
Nagios icons exist in three formats: PNG, GD2 and GIF. The best thing to do is to use only PNG files, because you then have the same icon for all the different modules of Nagios. Sample with kohkong:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

In this case we use suse.png.
10.2.1 Icon image
icon_image is used for the normal menu of Nagios.
10.2.2 Vrml_image
vrml_image is used for the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format in 40x40 px. You can also convert any icons you find on the internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy them with your favorite SSH explorer into /usr/local/nagios/share/images
11 CIST Monitored hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.

[Network map; the labels recovered from the figure are listed below.]

Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Internet Access; Staff Servers; Staff PCs, Printers & WiFi (~40 PCs); Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006

- KAMPOT & KEP (HP Proliant): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS. Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1
- KOHKONG (HP Proliant): SMTP, POP, Antispam, Mail Antivirus. Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5
- KANDAL (HP Proliant): Students Files Server, Moodle, Antivirus ERO, Instant Messaging. Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5
- TAKEO (HP Desktop): Proxy, Firewall. SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1
- MODEM: ADSL Gateway, 512 Mb/s, Fixed Public IP Address
- PURSAT (PTC Desktop): Supervision. Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1
- PAILIN (PTC Desktop): Learning Management, DataBase, Print server, Staff Files Server. Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1
- PREYVENG (PTC Desktop): Data backup (Kohkong, Kandal, Pailin), Ghost server. Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1
1 Nagios CIST Phnom Penh project
1.1 Introduction
- Nagios is a monitoring tool based on Linux (GNU license). It makes it possible to monitor and map all the network objects in an IT environment.
1.2 Requirements
- Nagios will be installed on a PTC computer (Intel platform, P4 2.8 GHz, 2 GB RAM).
- This machine is located in the CIST IT servers room.
- The installation will be done on a RAID 1 architecture to be more efficient.
Note: Nagios is a monitoring server and has no significant impact on production. Scanning the network objects uses some bandwidth, so the network will not be improved by Nagios. Hosts that run a plugin client will not be noticeably affected by the plugin.
2 Installation Linux SuSE 10.2
2.1 Setup steps
Installation with 2 hard drives in RAID 1 mode.
2.1.1 Starting with Hardware Configuration
We assume that the PC/Server has 2 hard drives (SATA). Check in your BIOS that both drives are recognized (Award BIOS: press "Del" to enter "Setup" -> "Standard CMOS Setup"). Make sure that "First Boot Device" is CD/DVD, to boot from the Linux SuSE 10.2 DVD.
2.1.2 Starting with Linux SuSE 10.2 Installation
RAID installation (partitioning)
Insert the DVD "Suse Linux 10.2 ISO" and power on the PC/Server.
First Menu OpenSUSE -> Installation
Language -> Yes -> English US (default)
Installation mode -> New Installation
Clock / Time Zone -> Asia - Phnom Penh - Local Time (hardware clock)
Desktop Selection -> Gnome
Installation Settings -> Partitioning -> Create Custom Partition Setup -> Custom Partitioning (for Experts) (see the new document InstallRAID1.odt)
On the 80 GB hard drive: swap 2 GB, /boot about 1 GB, / about 53 GB, /home about 20 GB
Software Installation (Software)
Just after the RAID setup, select "Software" to add the additional packages for Nagios.
Check "Details" at the bottom -> in the Filter menu -> Search -> apache2 -> select "apache2" (28mb)
-> Search -> gcc -> select "gcc, the system GNU C Compiler" -> Accept
Agree -> Install -> Automatic Reboot
Password for System Administrator -> you are invited to enter the "root password"
Hostname and Domain Name -> Pursat and cist.lan
-> uncheck "Change Hostname via DHCP"
Network Configuration -> enable SSH in the Firewall menu -> SSH open
Network Interface -> click on Network Interface -> Edit -> Static IP Address -> 10.20.30.44 / 255.255.255.0
-> Hostname and Name Server -> Name Server 1 and 2 = 10.20.30.43, 10.20.30.40
-> Routing -> 10.20.30.20
-> Next to validate your configuration
Test Internet Connection -> Yes; normally a response appears with "Success"
Novell Customer Center Configuration -> Configure Later
Additional Installation Sources -> No
User Authentication Method -> check Local (/etc/passwd)
New Local User -> Full name: nagios administrator, Username: nagios / Nagiosadmin, and password
Hardware Configuration -> check your configuration -> Next
Installation Completed -> Finish
3 Nagios Installation
This has been taken from www.nagios.org: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html
3.1 Introduction
This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on openSUSE and have it monitoring your local machine inside of 20 minutes. No advanced installation options are discussed here - just the basics that will work for 95% of users who want to get started. These instructions were written based on an openSUSE 10.2 installation.
3.2 Required Packages
Make sure you've installed the following packages on your openSUSE installation before continuing. You can use yast to install packages under openSUSE.
- apache2
- C/C++ development libraries
Check that you have internet access: open a terminal session and run
        nslookup www.google.com
3.3 Create Account Information
Become the root user:
        su -l
Create a new nagios user account and give it a password:
        /usr/sbin/useradd nagios
        passwd nagios
Create a new nagios group. Add the nagios user to the group:
        /usr/sbin/groupadd nagios
        /usr/sbin/usermod -G nagios nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group:
        /usr/sbin/groupadd nagcmd
        /usr/sbin/usermod -G nagcmd nagios
        /usr/sbin/usermod -G nagcmd wwwrun
3.4 Download Nagios and the Plugins
Create a directory for storing the downloads:
        mkdir ~/downloads
        cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively.
        wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
        wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
3.5 Compile and Install Nagios
Extract the Nagios source code tarball:
        cd ~/downloads
        tar xzf nagios-3.0a4.tar.gz
        cd nagios-3.0a4
Run the Nagios configure script, passing the name of the group you created earlier, like so:
        ./configure --with-command-group=nagcmd
Compile the Nagios source code:
        make all
Install binaries, init script, sample config files and set permissions on the external command directory:
        make install
        make install-init
        make install-config
        make install-commandmode
Don't start Nagios yet - there's still more that needs to be done.
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web interface
Install the Nagios web config file in the Apache conf.d directory:
        make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later.
        htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect:
        service apache2 restart
   (or) rcapache2 start
        service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball:
        cd ~/downloads
        tar xzf nagios-plugins-1.4.7.tar.gz
        cd nagios-plugins-1.4.7
Compile and install the plugins:
        ./configure --with-nagios-user=nagios --with-nagios-group=nagios
        make
        make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them start automatically when the system boots:
        chkconfig --add nagios
        chkconfig nagios on
        chkconfig --add apache2
        chkconfig apache2 on
Verify the sample Nagios configuration files:
        /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are problems:
        mkdir /home/nagios
        chown nagios /home/nagios
If there are no errors, start Nagios:
        service nagios start
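The verify-then-start sequence can be wrapped so that Nagios is only started, or later restarted, when the configuration check succeeds. This is a minimal sketch: the binary and configuration paths are the ones used in this section, while the function name verify_then and the log file location are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: run a command only if "nagios -v" accepts the configuration.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"
VERIFY_LOG="${TMPDIR:-/tmp}/nagios-verify.log"    # assumed log location

verify_then() {
    # "$@" is the command to run when the check passes,
    # for example: service nagios start
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" > "$VERIFY_LOG" 2>&1; then
        "$@"
    else
        echo "Configuration check failed; see $VERIFY_LOG" >&2
        return 1
    fi
}

# Usage (as root):
#   verify_then service nagios start
```

The same wrapper is handy after every later edit of the object files, since a restart with a broken configuration would leave monitoring stopped.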
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at one of the URLs below. You'll be prompted for the username (nagiosadmin) and the password you specified earlier.
    http://localhost/nagios/
    http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this as follows:
- Open the control center.
- Select "Open Administrator Settings" to open the YaST administrator control center.
- Select "Firewall" from the "Security and Users" category.
- Click the "Allowed Services" option in the Firewall Configuration window.
- Add "HTTP Server" to the allowed services list for the "External Zone".
- Click "Next" and "Accept" to activate the new firewall settings.
4 Nagios Documentation
The main Nagios site is www.nagios.org.
The online documentation is at http://nagios.sourceforge.net/docs/3_0/.
You can also access the documentation directly from the main page of the Nagios web interface.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven by configuration files (.cfg files). To activate a new configuration, edit these files with your favorite editor (vi, gedit, ...). You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on host and object definitions.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the object definitions section of the Nagios documentation.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.).
- Contacts receive notifications for the hosts and services they are responsible for.
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc.
8.2 Backup the Configuration Files
The only files you need are: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. To back them up, just copy them to a directory of your choice.
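The backup described above can be sketched as a small shell function. This is only an illustration: the function name backup_nagios_config and the backup directory in the usage line are hypothetical, and the file selection (all .cfg files plus htpasswd.users) is an assumption based on the list above.

```shell
# Hypothetical helper: copy the Nagios configuration files to a backup directory.
# Usage: backup_nagios_config SOURCE_DIR BACKUP_DIR
backup_nagios_config() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/*.cfg "$src"/htpasswd.users; do
        # Skip patterns that matched nothing and files that do not exist
        if [ -f "$f" ]; then
            cp -p "$f" "$dest"/ || return 1
        fi
    done
}
```

It could be called as: backup_nagios_config /usr/local/nagios/etc /root/nagios-backup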
8.3 Nagios.cfg
nagios.cfg is the master file that loads all the other configuration files. By default, Nagios does not read all of them, so you need to enable the reading of the other configuration files. Look for the lines marked "<--------- UNCOMMENT" in the listing and uncomment the corresponding cfg_file lines to activate them.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions (<--------- UNCOMMENT the cfg_file line)
cfg_file=/usr/local/nagios/etc/commands.cfg

# Host and service definitions, etc. for monitoring this machine (<--------- UNCOMMENT the cfg_file line)
cfg_file=/usr/local/nagios/etc/localhost.cfg

# Sample definitions for monitoring a Windows machine (<--------- UNCOMMENT the cfg_file line)
cfg_file=/usr/local/nagios/etc/windows.cfg

# Sample definitions for monitoring a network printer (<--------- UNCOMMENT the cfg_file line)
cfg_file=/usr/local/nagios/etc/printer.cfg

# Sample definitions for monitoring a switch/router (<--------- UNCOMMENT the cfg_file line)
cfg_file=/usr/local/nagios/etc/switch.cfg

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
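To confirm which object files are really active after editing nagios.cfg, the uncommented cfg_file lines can be listed from the shell. This is a minimal sketch; the function name active_cfg_files is hypothetical and it only looks at cfg_file directives, not cfg_dir.

```shell
# Hypothetical helper: print the object files that are active (not commented
# out) in a Nagios main configuration file.
# Usage: active_cfg_files /usr/local/nagios/etc/nagios.cfg
active_cfg_files() {
    # Lines starting with "#" do not match, so commented entries are skipped
    sed -n 's/^[[:space:]]*cfg_file=//p' "$1"
}
```

Any file that is expected but missing from the output still has its cfg_file line commented out.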
8.4 Localhost.cfg
By default, localhost.cfg only describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           Normal Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
#
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }

# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server     ; The name of this host template
        use                     generic-host     ; This template inherits other values from the generic-host template
        check_period            24x7             ; By default, Linux hosts are checked round the clock
        check_interval          5                ; Actively check the host every 5 minutes
        retry_interval          1                ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10               ; Check each Linux host 10 times (max)
        check_command           check-host-alive ; Default command to check Linux hosts
        notification_period     workhours        ; Linux admins hate to be woken up, so we only notify during the day
                                                 ; Note that the notification_period variable is being overridden from
                                                 ; the value that is inherited from the generic-host template!
        notification_interval   120              ; Resend notifications every 2 hours
        notification_options    d,u,r            ; Only send notifications for specific host states
        contact_groups          admins           ; Notifications get sent to the admins by default
        register                0                ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after it.

# Nagios host
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

# kohkong
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

# takeo ("," is used to list multiple IP addresses)
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, like kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The 'name' of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service 'freshness'
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# Local service definition template - This is NOT a real service, just a template!
define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# Services for kohkong

# Define a service to "ping" the machine kohkong
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Services for takeo

# Define a service to "ping" the machine takeo
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
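After copying service blocks for a new host, it can help to list which host names an object file actually refers to, so that the members line of the allhosts hostgroup can be kept in sync. The function below is a minimal sketch; the name list_host_names is hypothetical, and it assumes one host_name directive per line with a single host name as its value.

```shell
# Hypothetical helper: list the distinct host_name values used in an object file.
# Usage: list_host_names /usr/local/nagios/etc/localhost.cfg
list_host_names() {
    # Print the second field of every "host_name" line, then de-duplicate.
    # LC_ALL=C keeps the sort order independent of the system locale.
    awk '$1 == "host_name" { print $2 }' "$1" | LC_ALL=C sort -u
}
```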
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host        ; Inherit default values from a template
        host_name       remotehost          ; The name we're giving to this host
        alias           Some Remote Host    ; A longer name associated with the host
        address         192.168.1.50        ; IP address of the host
        hostgroups      allhosts            ; Host groups this host is associated with
        }
For kohkong, the host-specific part of the definition looks like this:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
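Since new Linux hosts are added by copying an existing definition and changing a few values, the repetitive part can be generated. The function below is a minimal sketch of that idea, assuming the linux-server template from localhost.cfg; the function name print_host_definition and the host name and address in the usage line are hypothetical.

```shell
# Hypothetical helper: print a host definition that inherits from linux-server.
# Usage: print_host_definition HOST_NAME ADDRESS PARENT
print_host_definition() {
    cat <<EOF
define host{
        use             linux-server
        host_name       $1
        alias           $1
        address         $2
        parents         $3
        }
EOF
}
```

For example, print_host_definition newhost 192.168.2.10 CistSW001 prints a block that can be appended to localhost.cfg. The new host still has to be added to the members line of the allhosts hostgroup, and the configuration should be verified with nagios -v before restarting.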
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point - either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our example for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
We can do this for all services. A new service, such as HTTP on port 8080, is defined in the same way, as was done for takeo with the HTTP_8080 service in localhost.cfg.
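The takeo service HTTP_8080 refers to a check_http_8080 command, which is not among the sample definitions quoted in this guide. The function below is a minimal sketch of what such a command and its service could look like, assuming the -p (port) option of the check_http plugin and the local-service template; the function name print_http_port_check is hypothetical and the generated definitions are not taken from the CIST configuration.

```shell
# Hypothetical helper: print a command definition and a service definition
# for an HTTP check on a non-default port.
# Usage: print_http_port_check PORT HOST_NAME
print_http_port_check() {
    port=$1
    host=$2
    # \$ keeps the Nagios macros ($USER1$, $HOSTADDRESS$) literal in the output
    cat <<EOF
define command{
        command_name    check_http_$port
        command_line    \$USER1\$/check_http -I \$HOSTADDRESS\$ -p $port
        }

define service{
        use                     local-service
        host_name               $host
        service_description     HTTP_$port
        check_command           check_http_$port
        notifications_enabled   0
        }
EOF
}
```

Running print_http_port_check 8080 takeo prints both blocks; the command definition would go in commands.cfg and the service definition in localhost.cfg.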
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work in the same way as the localhost.cfg file. The main difference is that they are specially designed for printers, switches and Windows machines. For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
- Windows machines -> NSClient++ (nsclient++.exe)
- Linux machines -> nagios-plugins-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as:
- Memory usage
- CPU load
- Disk usage
- Service states
- Running processes
- etc.
Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes:
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc. and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
       nsclient++ /install
5. Install the NSClient++ systray with the following command:
       nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we have made a change with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and NSClient++. The allowed_hosts option is commented out ("remark" mode) by default and has to be uncommented to be activated.
8. Start the NSClient++ service with the following command:
       nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server   ; The name of this host template
        use                     generic-host     ; Inherit default values from the generic-host template
        check_period            24x7             ; By default, Windows servers are monitored round the clock
        check_interval          5                ; Actively check the server every 5 minutes
        retry_interval          1                ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10               ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7             ; Send notifications out at any time - day or night
        notification_interval   30               ; Resend notifications every 30 minutes
        notification_options    d,r              ; Only send notifications for specific host states
        contact_groups          admins           ; Notifications get sent to the admins by default
        register                0                ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server      ; Inherit default values from a template
        host_name       winserver           ; The name we're giving to this host
        alias           My Windows Server   ; A longer name associated with the host
        address         192.168.1.2         ; IP address of the host
        hostgroups      allhosts            ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name  windows-servers     ; The name of the hostgroup
        alias           Windows Servers     ; Long name of the group
        members         winserver           ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
94 Monitoring Services
Now that the NSCLient++ addon has been installed on the Windows machine and youve configured a host definition for the machine in Nagios you can addon some service definitions for things you want to monitor All of the service examples Ill cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine The check_nt plugin is included in the Nagios plugins distribution and a command definition for using the plugin has been defined in the commandscfg file It looks like this define command
command_name check_nt
command_line $USER1$check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$
$ARG2$
Now lets go over some example service definitions for monitoring different aspects of the Windows machine
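As a rough illustration of how a check_command value maps onto the check_nt command definition, the sketch below splits a check_command on "!" and fills in $ARG1$ and $ARG2$ by hand. The function name expand_check_nt is hypothetical, and PLUGIN_DIR (standing in for $USER1$) is assumed to be /usr/local/nagios/libexec; Nagios performs this expansion internally, so this is only a way to see the resulting command line.

```shell
#!/bin/sh
# Sketch only: mimic how Nagios turns a check_command into a check_nt command line.
# PLUGIN_DIR is an assumed value for the $USER1$ macro.
PLUGIN_DIR="/usr/local/nagios/libexec"

expand_check_nt() {
    host_address="$1"       # plays the role of $HOSTADDRESS$
    check_command="$2"      # e.g. 'check_nt!CPULOAD!-l 5,80,90'

    # Split the check_command on '!' into positional parameters.
    old_ifs="$IFS"
    IFS='!'
    set -f                  # no filename globbing while splitting
    set -- $check_command
    set +f
    IFS="$old_ifs"

    # $1 is the command name, $2 is $ARG1$, $3 is $ARG2$ (may be empty).
    arg1="${2:-}"
    arg2="${3:-}"
    if [ -n "$arg2" ]; then
        printf '%s/check_nt -H %s -p 12489 -v %s %s\n' \
            "$PLUGIN_DIR" "$host_address" "$arg1" "$arg2"
    else
        printf '%s/check_nt -H %s -p 12489 -v %s\n' \
            "$PLUGIN_DIR" "$host_address" "$arg1"
    fi
}

# Example: show the command line behind a CPU load check.
expand_check_nt 192.168.1.2 'check_nt!CPULOAD!-l 5,80,90'
```

Running the printed command by hand on the Nagios server is a convenient way to test that the NSClient++ agent answers before adding the service definition.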
9.5 Monitoring NSClient++ Version

The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

    define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }

9.6 Monitoring Uptime

The following service definition will allow you to monitor the uptime of the Windows server.

    define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }

9.7 Monitoring CPU Load

The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

    define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }

9.8 Monitoring Memory Usage

The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

    define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage

The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

    define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }

9.10 Monitoring A Windows Service

The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

    define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }

9.11 Monitoring A Windows Process

The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

    define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
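The warning and critical thresholds used in the memory and disk examples (-w 80 -c 90) follow the usual Nagios plugin convention: exit code 0 means OK, 1 means WARNING and 2 means CRITICAL. The sketch below shows that mapping for a plain percentage. The helper name state_for_usage is hypothetical and is not part of check_nt; it only illustrates how a measured value is compared against the two thresholds.

```shell
#!/bin/sh
# Sketch only: map a usage percentage to a Nagios state, given warning and
# critical thresholds. state_for_usage is an illustrative helper name.
state_for_usage() {
    usage="$1"      # measured value, integer percent
    warn="$2"       # warning threshold (like -w 80)
    crit="$3"       # critical threshold (like -c 90)

    if [ "$usage" -ge "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$usage" -ge "$warn" ]; then
        echo "WARNING"
        return 1
    else
        echo "OK"
        return 0
    fi
}

# Example: 85% used with thresholds 80/90.
state_for_usage 85 80 90 || true
```

The critical threshold is tested first, so a value above both thresholds is reported as CRITICAL rather than WARNING.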
10 Statusmap

10.1 How to get a smooth map

The Statusmap is the human-readable, visual status of the CIST network.
To get a smooth map we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.

10.2 Adding and Changing Icons

Nagios icons exist in several formats, including GIF and GD2. The simplest approach is to use only PNG files, so that the same icon can be used in all the different Nagios modules. Example with kohkong:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

In this case we use suse.png.

10.2.1 Icon image

This image is used in the normal Nagios menus.
10.2.2 Vrml_image

This image is used in the 3D map environment. Because of our special Nagios theme we don't use it. If you do use the 3D map, the web browser (Internet Explorer or Firefox) needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").

10.2.3 Statusmap_image

This image is used in the 2D status map, which is the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 px, so you can also convert any icons you find on the internet to this format. Here is an online tool to do this: http://www.easypict.org

Where to put the icons

Copy the icons, with your favorite SSH file-transfer tool, into /usr/local/nagios/share/images
11 CIST Monitored Hosts

Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.

[Network map. Recoverable contents listed below.]

Networks shown on the map:
- 192.168.1.0/30
- 192.168.2.0/28
- 192.168.3.0/26
- 172.16.0.0/23

Zones shown on the map:
- Internet
- Common Servers
- Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only)
- Staff Servers, Staff PCs, Printers & WiFi (~40 PCs)

Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006

Hosts:
- KAMPOT & KEP: Primary and Secondary Domain Controller, DNS, DHCP, NTP, WSUS. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1
- KOHKONG: SMTP, POP, Antispam, Mail Antivirus. HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5
- KANDAL: Students Files Server, Moodle, Antivirus, ERO, Instant Messaging. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5
- TAKEO: Proxy, Firewall. HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1
- MODEM: ADSL Gateway. 512 Mbs, Fixed Public IP Address
- PURSAT: Supervision. PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1
- PAILIN: Learning Management, DataBase, Print server, Staff Files Server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1
- PREYVENG: Data backup (Kohkong, Kandal, Pailin), Ghost server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1
Desktop Selection
    -> Gnome
Installation Settings
    -> Partitioning -> Create Custom Partition Setup -> Custom Partitioning (for Experts) (new documentation: InstallRAID1.odt)
    On an 80 GB hard drive: swap 2 GB, /boot about 1 GB, / about 53 GB, /home about 20 GB
Software Installation (Software)
    Just after the RAID setup, select "Software" to add the additional packages for Nagios.
    Check "Details" at the bottom -> in the Filter menu -> Search -> apache2 -> select "apache2" (28mb)
    -> Search -> gcc -> select "gcc, the system GNU C Compiler" -> Accept -> Agree -> Install
    -> Automatic Reboot
Password for System Administrator
    -> you're invited to enter the root password
Hostname and Domain Name
    -> Pursat and cist.lan
    -> uncheck "Change Hostname via DHCP"
Network Configuration
    -> enable SSH in the Firewall menu -> SSH open
Network Interface
    -> click on Network Interface -> Edit -> Static IP Address -> 10.20.30.44 / 255.255.255.0
    -> Hostname and Name Server -> Name Server 1 and 2 = 10.20.30.43 and 10.20.30.40
    -> Routing -> 10.20.30.20
    -> Next to validate your configuration
Test Internet Connection
    -> Yes; normally a response appears with "Success"
Novell Customer Center Configuration
    -> Configure Later
Additional Installation Sources
    -> No
User Authentication Method
    -> check Local (/etc/passwd)
New Local User
    -> full name: nagios administrator; username: nagios; Nagiosadmin and password
Hardware Configuration
    -> check your configuration -> Next
Installation Completed
    -> Finish
3 Nagios Installation

This has been taken from www.nagios.org: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html

3.1 Introduction

This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on openSUSE and have it monitoring your local machine inside of 20 minutes. No advanced installation options are discussed here, just the basics that will work for 95% of users who want to get started. These instructions were written based on an openSUSE 10.2 installation.

3.2 Required Packages

Make sure you've installed the following packages on your openSUSE installation before continuing. You can use yast to install packages under openSUSE.

- apache2
- C/C++ development libraries

Check that you have internet access. Open a terminal session and run:

    nslookup www.google.com
3.3 Create Account Information

Become the root user.

    su -l

Create a new nagios user account and give it a password.

    /usr/sbin/useradd nagios
    passwd nagios

Create a new nagios group. Add the nagios user to the group.

    /usr/sbin/groupadd nagios
    /usr/sbin/usermod -G nagios nagios

Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.

    /usr/sbin/groupadd nagcmd
    /usr/sbin/usermod -G nagcmd nagios
    /usr/sbin/usermod -G nagcmd wwwrun

3.4 Download Nagios and the Plugins

Create a directory for storing the downloads.

    mkdir ~/downloads
    cd ~/downloads

Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively.

    wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
    wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz

3.5 Compile and Install Nagios

Extract the Nagios source code tarball.

    cd ~/downloads
    tar xzf nagios-3.0a4.tar.gz
    cd nagios-3.0a4

Run the Nagios configure script, passing the name of the group you created earlier like so:

    ./configure --with-command-group=nagcmd

Compile the Nagios source code.

    make all

Install binaries, init script, sample config files and set permissions on the external command directory.

    make install
    make install-init
    make install-config
    make install-commandmode

Don't start Nagios yet; there's still more that needs to be done.
3.6 Customize Configuration

Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.

3.7 Configure the Web Interface

Install the Nagios web config file in the Apache conf.d directory.

    make install-webconf

Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account; you'll need it later.

    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Restart Apache to make the new settings take effect.

    service apache2 restart
    (or) rcapache2 start
    service apache2 status

3.8 Compile and Install the Nagios Plugins

Extract the Nagios plugins source code tarball.

    cd ~/downloads
    tar xzf nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7

Compile and install the plugins.

    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make
    make install

3.9 Start Nagios

Add Nagios and Apache to the list of system services and have them start automatically when the system boots.

    chkconfig --add nagios
    chkconfig nagios on
    chkconfig --add apache2
    chkconfig apache2 on

Verify the sample Nagios configuration files.

    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are problems:

    mkdir /home/nagios
    chown nagios /home/nagios

If there are no errors, start Nagios.

    service nagios start
3.10 Login to the Web Interface

You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.

    http://localhost/nagios/ or http://pursat/nagios/

Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.

3.11 Other Modifications

Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this by:

- Opening the control center
- Selecting "Open Administrator Settings" to open the YaST administrator control center
- Selecting "Firewall" from the "Security and Users" category
- Clicking the "Allowed Services" option in the Firewall Configuration window
- Adding "HTTP Server" to the allowed services list for the "External Zone"
- Clicking "Next" and "Accept" to activate the new firewall settings

4 Nagios Documentation

The main Nagios site is www.nagios.org
The online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main Nagios web interface.

5 Thanks

Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files

6.1 Introduction

Nagios is driven by its configuration files (.cfg files). To activate a new configuration, these files have to be modified with your favorite editor (vi, gedit, ...). You need root access to modify them.

7 Basics of Nagios

7.1 Description

- Nagios works with several configuration files, which are based on host/object configuration.

7.2 What Are Objects

Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:

- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies

More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined

Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.

7.4 How Are Objects Defined

Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation.

Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions as described in the documentation on object tricks.

7.5 Objects Explained

Some of the main object types are explained in greater detail below.

Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:

- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.)
- Hosts have an address of some kind (e.g. an IP or MAC address)
- Hosts have one or more services associated with them
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:

- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)

Contacts are people involved in the notification process:

- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for the hosts and services they are responsible for

Timeperiods are used to control:

- When hosts and services can be monitored
- When contacts can receive notifications

Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:

- Host and service checks
- Notifications
- Event handlers
- and more
8 Configuration Files For CIST

8.1 Location

All the configuration files are in /usr/local/nagios/etc

8.2 Backup the Configuration Files

The only files you need are the following: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to a directory of your choice, and that's it.
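The backup step can be sketched as a small shell function. This is only one possible way to do it, under stated assumptions: the function name backup_nagios_cfg, the dated directory name and the choice of cp are illustrative, and the source directory is assumed to be /usr/local/nagios/etc as described in 8.1.

```shell
#!/bin/sh
# Sketch only: copy the Nagios configuration files into a dated backup directory.
# backup_nagios_cfg and the directory naming scheme are illustrative choices.
backup_nagios_cfg() {
    src_dir="$1"        # e.g. /usr/local/nagios/etc
    dest_root="$2"      # any directory you can write to
    dest_dir="$dest_root/nagios-etc-$(date +%Y%m%d)"

    mkdir -p "$dest_dir" || return 1

    # Copy every .cfg file, plus the web password file if it exists.
    for f in "$src_dir"/*.cfg "$src_dir"/htpasswd.users; do
        if [ -f "$f" ]; then
            cp -p "$f" "$dest_dir"/ || return 1
        fi
    done

    # Print where the copies went.
    echo "$dest_dir"
}
```

A typical call, run as root, would be `backup_nagios_cfg /usr/local/nagios/etc /root/backups`. Restoring is the reverse copy, followed by a configuration check with `nagios -v` before restarting Nagios.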
8.3 Nagios.cfg

Nagios.cfg is the master file that loads all the other files. By default Nagios is "not open": the other configuration files are not read, so you have to enable them. Look for the lines marked "<--------- UNCOMMENT" and uncomment them to activate them.
    # NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
    #
    # Read the documentation for more information on this configuration
    # file. I've provided some comments here, but things may not be so
    # clear without further explanation.
    #
    # Last Modified: 05-08-2007

    # LOG FILE
    # This is the main log file where service and host events are logged
    # for historical purposes. This should be the first option specified
    # in the config file!!!

    log_file=/usr/local/nagios/var/nagios.log

    # OBJECT CONFIGURATION FILE(S)
    # These are the object configuration files in which you define hosts,
    # host groups, contacts, contact groups, services, etc.
    # You can split your object definitions across several config files
    # if you wish (as shown below), or keep them all in a single config file.

    # Command definitions
    cfg_file=/usr/local/nagios/etc/commands.cfg      <--------- UNCOMMENT

    # Host and service definitions, etc. for monitoring this machine
    cfg_file=/usr/local/nagios/etc/localhost.cfg     <--------- UNCOMMENT

    # Sample definitions for monitoring a Windows machine
    cfg_file=/usr/local/nagios/etc/windows.cfg       <--------- UNCOMMENT

    # Sample definitions for monitoring a network printer
    cfg_file=/usr/local/nagios/etc/printer.cfg       <--------- UNCOMMENT

    # Sample definitions for monitoring a switch/router
    cfg_file=/usr/local/nagios/etc/switch.cfg        <--------- UNCOMMENT

    # You can also tell Nagios to process all config files (with a .cfg
    # extension) in a particular directory by using the cfg_dir
    # directive as shown below:

    #cfg_dir=/usr/local/nagios/etc/servers
    #cfg_dir=/usr/local/nagios/etc/printers
    #cfg_dir=/usr/local/nagios/etc/switches
    #cfg_dir=/usr/local/nagios/etc/routers
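After editing nagios.cfg, it can help to check which object files are really active. The sketch below prints only the cfg_file and cfg_dir directives that are not commented out. The function name list_active_object_files is hypothetical; the grep pattern assumes the usual layout in which a commented line starts with "#".

```shell
#!/bin/sh
# Sketch only: list the cfg_file / cfg_dir directives that are active
# (not commented out) in a Nagios main configuration file.
list_active_object_files() {
    main_cfg="$1"       # e.g. /usr/local/nagios/etc/nagios.cfg
    grep -E '^[[:space:]]*cfg_(file|dir)=' "$main_cfg"
}
```

For example, `list_active_object_files /usr/local/nagios/etc/nagios.cfg` should show windows.cfg, printer.cfg and switch.cfg once their lines have been uncommented.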
8.4 Localhost.cfg

By default, localhost.cfg describes only the Nagios host itself. To add new hosts, you can copy and paste its definitions within the same localhost.cfg file. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
    # LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
    #
    # Last Modified: 02-27-2007
    #
    # NOTE: This config file is intended to serve as an *extremely* simple
    #       example of how you can create your object configuration file(s).

    # TIME PERIODS

    # This defines a timeperiod where all times are valid for checks,
    # notifications, etc. The classic "24x7" support nightmare. :-)
    define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

    # 'workhours' timeperiod definition
    define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

    # 'none' timeperiod definition
    define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
    # COMMANDS
    #
    # NOTE: Sample command definitions can now be found in the sample commands.cfg file

    # CONTACTS

    # Generic contact definition template - This is NOT a real contact, just a template!
    define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
        }

    # Just one contact defined by default - the Nagios admin (that's you)
    define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
    # CONTACT GROUPS

    # We only have one contact in this simple configuration file, so there is
    # no need to create more than one contact group.
    define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }

    # HOSTS

    # Generic host definition template - This is NOT a real host, just a template!
    define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
    # Linux host definition template - This is NOT a real host, just a template!
    define host{
        name                    linux-server    ; The name of this host template
        use                     generic-host    ; This template inherits other values from the generic-host template
        check_period            24x7            ; By default, Linux hosts are checked round the clock
        check_interval          5               ; Actively check the host every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each Linux host 10 times (max)
        check_command           check-host-alive ; Default command to check Linux hosts
        notification_period     workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
    # Since this is a simple configuration file, we only monitor one host - the
    # local host (this machine).
    #
    # Add your new Linux (or similar) servers after this one.

    # Nagios host
    define host{
        use             linux-server            ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

    # kohkong
    define host{
        use             linux-server            ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

    # takeo
    define host{
        use             linux-server            ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1  ; "," to have multiple IP addresses
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
    # HOST GROUPS

    # We only have one host in our simple config file, so there is no need to
    # create more than one hostgroup.
    define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new host to the allhosts group here, like kohkong
        }
    # SERVICES

    # Generic service definition template - This is NOT a real service, just a template!
    define service{
        name                            generic-service ; The 'name' of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service 'freshness'
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
    # Local service definition template - This is NOT a real service, just a template!
    define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        register                0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
    # ---------- START COPY/PASTE for SERVICES ----------

    # Define a service to "ping" the local machine
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

    # Define a service to check the disk space of the root partition
    # on the local machine. Warning if < 20% free, critical if
    # < 10% free space on partition.
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

    # Define a service to check the number of currently logged in
    # users on the local machine. Warning if > 20 users, critical
    # if > 50 users.
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

    # Define a service to check the number of currently running procs
    # on the local machine. Warning if > 250 processes, critical if
    # > 400 processes.
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

    # Define a service to check the load on the local machine.
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

    # Define a service to check the swap usage of the local machine.
    # Critical if less than 10% of swap is free, warning if less than 20% is free.
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

    # Define a service to check SSH on the local machine.
    # Disable notifications for this service by default, as not all users may have SSH enabled.
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

    # Define a service to check HTTP on the local machine.
    # Disable notifications for this service by default, as not all users may have HTTP enabled.
    define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

    # ---------- END COPY/PASTE for SERVICES ----------
    # ---------- kohkong ----------

    # Define a service to "ping" the local machine
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

    # Define a service to check the disk space of the root partition
    # on the local machine. Warning if < 20% free, critical if
    # < 10% free space on partition.
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

    # Define a service to check the number of currently logged in
    # users on the local machine. Warning if > 20 users, critical
    # if > 50 users.
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

    # Define a service to check the number of currently running procs
    # on the local machine. Warning if > 250 processes, critical if
    # > 400 processes.
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

    # Define a service to check the load on the local machine.
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

    # Define a service to check the swap usage of the local machine.
    # Critical if less than 10% of swap is free, warning if less than 20% is free.
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

    # Define a service to check SSH on the local machine.
    # Disable notifications for this service by default, as not all users may have SSH enabled.
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

    # Define a service to check HTTP on the local machine.
    # Disable notifications for this service by default, as not all users may have HTTP enabled.
    define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Define a service to ping the local machine (takeo).
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }
# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }
# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP (port 8080) on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
In our case, the main directives of the kohkong host definition are:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own example, the HTTP check for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
The same can be done for all services. A new service, such as HTTP on port 8080, is defined in the same way; the HTTP_8080 service for takeo at the end of the localhost.cfg listing in section 8.4 is an example.
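A minimal sketch of the two pieces such a port 8080 check needs. The command name check_http_8080 is the one used by the takeo service in localhost.cfg, but its definition is not shown in this document, so the command below is an assumption. It relies on the -p option of the check_http plugin to select the port.

```cfg
# Assumed command definition (it would normally go in commands.cfg)
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Service that uses the command
define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

Keeping the port in the command definition means the service definition stays identical to the plain HTTP one apart from the command name.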
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that each one is designed for a specific type of device: printers, switches and Windows machines. For Windows and Linux servers, remember that you need an agent on the server to check its services:
- Windows machines -> Nsclient.exe
- Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as:
- Memory usage
- CPU load
- Disk usage
- Service states
- Running processes
- etc.
Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
        nsclient++ /install
5. Install the NSClient++ systray with the following command:
        nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
In our case (CIST), we have made a change with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and NSClient++. The allowed_hosts option is commented out by default and has to be activated by uncommenting it.
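As an illustration, the relevant part of NSC.INI could look like the fragment below once the option is uncommented. Both addresses are placeholders (assumptions), not the real CIST values: the first stands for the Nagios server and the second for the LAN interface of Takeo. The section name and the comma-separated list follow the NSC.INI file shipped with NSClient++ at the time; check the file installed on your machine, as the layout may differ between versions.

```cfg
[Settings]
; Before: the option is commented out with a leading semicolon
;allowed_hosts=127.0.0.1/32

; After: Nagios server and Takeo LAN interface (placeholder addresses)
allowed_hosts=192.0.2.10,192.0.2.1
```

The NSClient++ service has to be restarted for the change to take effect.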
8. Start the NSClient++ service with the following command:
        nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notifications out at any time, day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples covered here use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
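Because winserver belongs to the windows-servers hostgroup defined in section 9.3, a check can also be attached to the whole group rather than to a single host. The fragment below is a sketch of that idea, assuming the hostgroup name from section 9.3; Nagios accepts hostgroup_name in a service definition and creates the service on every member of the group.

```cfg
define service{
        use                     generic-service
        hostgroup_name          windows-servers         ; applies to every host in the group
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
```

Use this form in place of the per-host definition, not in addition to it, since the same service would otherwise be defined twice for winserver.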
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
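The SERVICESTATE check of check_nt also accepts a comma-separated list of service names after -l, so several Windows services can be watched by a single Nagios service. The fragment below is a sketch: W3SVC comes from the example above, while Spooler and the service description are only illustrative choices.

```cfg
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

One combined check keeps the configuration short, but separate service definitions make it easier to see at a glance which Windows service has stopped.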
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable, visual status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case, we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
The Nagios icons exist in three formats: GIF, GD2 and PNG. The best thing to do is to use only PNG files, because you can then have the same icon in all the different modules of Nagios. Sample with kohkong:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
In this case we use suse.png.
10.2.1 Icon image
icon_image is used in the normal menus of Nagios.
10.2.2 Vrml_image
vrml_image is used in the 3D map environment, which we do not use because of our special Nagios theme. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to run it correctly. You can find it at http://www.parallelgraphics.com/products/cortona/ ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format at 40x40 pixels. You can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy the icons with your favorite SSH file transfer client to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.
[Network map. Recoverable information from the diagram:]
Servers:
- KAMPOT & KEP (HP Proliant): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS. Windows 2003 Server. 3.2 GHz, 2 GB, 148 GB RAID 1
- KOHKONG (HP Proliant): SMTP, POP, Antispam, Mail Antivirus. Open SuSE 10.2. 3.2 GHz, 2 GB, 280 GB RAID 5
- KANDAL (HP Proliant): Students Files Server, Moodle, Antivirus, ERO, Instant Messaging. Windows 2003 Server. 3.2 GHz, 2 GB, 280 GB RAID 5
- TAKEO (HP Desktop): Proxy, Firewall. SuSE LES 10.2. 3.2 GHz, 2 GB, 80 GB RAID 1
- MODEM: ADSL Gateway. 512 Mbs. Fixed Public IP Address
- PURSAT (PTC Desktop): Supervision. Open SuSE 10.2. 2.6 GHz, 2 GB, 80 GB RAID 1
- PAILIN (PTC Desktop): Learning Management, DataBase, Print server, Staff Files Server. Windows 2003 Server. 2.6 GHz, 2 GB, 320 GB RAID 1
- PREYVENG (PTC Desktop): Data backup (Kohkong, Kandal, Pailin), Ghost server. Windows 2003 Server. 2.6 GHz, 2 GB, 500 GB RAID 1
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Networks shown: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones shown: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet Access; Internet
3 Nagios Installation
This has been taken from www.nagios.org: http://nagios.sourceforge.net/docs/3_0/quickstart-opensuse.html
3.1 Introduction
This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on openSUSE and have it monitoring your local machine inside of 20 minutes. No advanced installation options are discussed here, just the basics that will work for 95% of users who want to get started. These instructions were written based on an openSUSE 10.2 installation.
3.2 Required Packages
Make sure you've installed the following packages on your openSUSE installation before continuing. You can use yast to install packages under openSUSE.
- apache2
- C/C++ development libraries
Check that you have Internet access. Open a terminal session and run:
        nslookup www.google.com
3.3 Create Account Information
Become the root user.
        su -l
Create a new nagios user account and give it a password.
        /usr/sbin/useradd nagios
        passwd nagios
Create a new nagios group. Add the nagios user to the group.
        /usr/sbin/groupadd nagios
        /usr/sbin/usermod -G nagios nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.
        /usr/sbin/groupadd nagcmd
        /usr/sbin/usermod -G nagcmd nagios
        /usr/sbin/usermod -G nagcmd wwwrun
3.4 Download Nagios and the Plugins
Create a directory for storing the downloads.
        mkdir ~/downloads
        cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively.
        wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
        wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
3.5 Compile and Install Nagios
Extract the Nagios source code tarball.
        cd ~/downloads
        tar xzf nagios-3.0a4.tar.gz
        cd nagios-3.0a4
Run the Nagios configure script, passing the name of the group you created earlier, like so:
        ./configure --with-command-group=nagcmd
Compile the Nagios source code.
        make all
Install binaries, init script, sample config files and set permissions on the external command directory.
        make install
        make install-init
        make install-config
        make install-commandmode
Don't start Nagios yet; there's still more that needs to be done.
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web interface
Install the Nagios web config file in the Apache conf.d directory.
        make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account; you'll need it later.
        htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect, then check its status.
        service apache2 restart      (or: rcapache2 start)
        service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball.
        cd ~/downloads
        tar xzf nagios-plugins-1.4.7.tar.gz
        cd nagios-plugins-1.4.7
Compile and install the plugins.
        ./configure --with-nagios-user=nagios --with-nagios-group=nagios
        make
        make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them start automatically when the system boots.
        chkconfig --add nagios
        chkconfig nagios on
        chkconfig --add apache2
        chkconfig apache2 on
Verify the sample Nagios configuration files.
        /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are problems:
        mkdir /home/nagios
        chown nagios /home/nagios
If there are no errors, start Nagios.
        service nagios start
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.
        http://localhost/nagios/ or http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this as follows:
- Open the control center.
- Select "Open Administrator Settings" to open the YaST administrator control center.
- Select "Firewall" from the "Security and Users" category.
- Click the "Allowed Services" option in the Firewall Configuration window.
- Add "HTTP Server" to the allowed services list for the "External Zone".
- Click "Next" and "Accept" to activate the new firewall settings.
4 Nagios Documentation
The main Nagios site is www.nagios.org
The online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main web interface of Nagios.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
The main mechanism of Nagios is based on configuration files (.cfg files). To activate a new configuration, these files have to be modified with your favorite editor (vi, gedit, ...). You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on host and object definitions.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios online documentation.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
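The parent/child relationship is expressed with the parents directive of a host definition. The fragment below is a minimal sketch, assuming a switch host named CistSW001 (the parent used in the CIST localhost.cfg) and a generic-switch template like the one in the sample switch.cfg; the alias and the address of the switch are placeholders.

```cfg
define host{
        use             generic-switch          ; assumed template name
        host_name       CistSW001
        alias           Core switch             ; placeholder alias
        address         192.0.2.2               ; placeholder address
        }

define host{
        use             linux-server
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001               ; kohkong is reached through this switch
        }
```

With this relationship in place, if CistSW001 goes down Nagios reports kohkong as UNREACHABLE rather than DOWN, which avoids a flood of misleading alerts.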
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for the hosts and services they are responsible for
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
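A command object simply binds a name to a command line. As an illustration, the check-host-alive command referenced by the host templates could be defined along the following lines. The thresholds shown are assumptions, so check commands.cfg for the values actually installed.

```cfg
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }
```

$USER1$ is a macro, normally set in resource.cfg, that points to the plugin directory; $HOSTADDRESS$ is replaced at run time by the address of the host being checked.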
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to a directory of your choice, and that's it.
8.3 Nagios.cfg
nagios.cfg is the master file that loads all the other files. By default, not all of them are enabled, so you need to allow the other configuration files to be read. Look for the lines marked "<--------- UNCOMMENT" below and uncomment them to activate those files.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file!!!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg     <--------- UNCOMMENT

# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg    <--------- UNCOMMENT

# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg       <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           Normal Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server    ; The name of this host template
        use                     generic-host    ; This template inherits other values from the generic-host template
        check_period            24x7            ; By default, Linux hosts are checked round the clock
        check_interval          5               ; Actively check the host every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each Linux host 10 times (max)
        check_command           check-host-alive ; Default command to check Linux hosts
        notification_period     workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }

# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after this one.
define host{
    use              linux-server      ; Name of host template to use
                                       ; This host definition will inherit all variables that are defined
                                       ; in (or inherited by) the linux-server host template definition.
    host_name        Nagios            ; The Nagios host
    alias            localhost
    address          127.0.0.1
    icon_image       ultrapenguin.png
    vrml_image       ultrapenguin.png
    statusmap_image  ultrapenguin.png
    }

define host{
    use              linux-server      ; Name of host template to use
                                       ; This host definition will inherit all variables that are defined
                                       ; in (or inherited by) the linux-server host template definition.
    host_name        kohkong           ; kohkong
    alias            localhost
    address          192.168.2.5
    parents          CistSW001
    icon_image       suse.png
    vrml_image       suse.png
    statusmap_image  suse.png
    }

define host{
    use              linux-server      ; Name of host template to use
                                       ; This host definition will inherit all variables that are defined
                                       ; in (or inherited by) the linux-server host template definition.
    host_name        takeo             ; takeo
    alias            localhost
    address          192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1  ; "," to have multiple IP addresses
    parents          CistSW001
    icon_image       suse.png
    vrml_image       suse.png
    statusmap_image  suse.png
    }

# HOST GROUPS
# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
    hostgroup_name  allhosts
    alias           All Hosts
    members         Nagios,kohkong,takeo  ; Add your new host to the allhosts group here, like kohkong
    }

# SERVICES
# Generic service definition template - This is NOT a real service, just a template!
define service{
    name                          generic-service  ; The name of this service template
    active_checks_enabled         1      ; Active service checks are enabled
    passive_checks_enabled        1      ; Passive service checks are enabled/accepted
    parallelize_check             1      ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service           1      ; We should obsess over this service (if necessary)
    check_freshness               0      ; Default is to NOT check service freshness
    notifications_enabled         1      ; Service notifications are enabled
    event_handler_enabled         1      ; Service event handler is enabled
    flap_detection_enabled        1      ; Flap detection is enabled
    failure_prediction_enabled    1      ; Failure prediction is enabled
    process_perf_data             1      ; Process performance data
    retain_status_information     1      ; Retain status information across program restarts
    retain_nonstatus_information  1      ; Retain non-status information across program restarts
    is_volatile                   0      ; The service is not volatile
    check_period                  24x7   ; The service can be checked at any time of the day
    max_check_attempts            3      ; Re-check the service up to 3 times in order to determine its final (hard) state
    normal_check_interval         10     ; Check the service every 10 minutes under normal conditions
    retry_check_interval          2      ; Re-check the service every two minutes until a hard state can be determined
    contact_groups                admins ; Notifications get sent out to everyone in the admins group
    notification_options          w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval         60     ; Re-notify about service problems every hour
    notification_period           24x7   ; Notifications can be sent out at any time
    register                      0      ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
    }

# Local service definition template - This is NOT a real service, just a template!
define service{
    name                   local-service    ; The name of this service template
    use                    generic-service  ; Inherit default values from the generic-service definition
    max_check_attempts     4                ; Re-check the service up to 4 times in order to determine its final (hard) state
    normal_check_interval  5                ; Check the service every 5 minutes under normal conditions
    retry_check_interval   1                ; Re-check the service every minute until a hard state can be determined
    register               0                ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
    }

# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
    use                  local-service  ; Name of service template to use
    host_name            Nagios
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%
    }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
    use                  local-service  ; Name of service template to use
    host_name            Nagios
    service_description  Root Partition
    check_command        check_local_disk!20%!10%!/
    }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
    use                  local-service  ; Name of service template to use
    host_name            Nagios
    service_description  Current Users
    check_command        check_local_users!20!50
    }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
    use                  local-service  ; Name of service template to use
    host_name            Nagios
    service_description  Total Processes
    check_command        check_local_procs!250!400!RSZDT
    }

# Define a service to check the load on the local machine.
define service{
    use                  local-service  ; Name of service template to use
    host_name            Nagios
    service_description  Current Load
    check_command        check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
    use                  local-service  ; Name of service template to use
    host_name            Nagios
    service_description  Swap Usage
    check_command        check_local_swap!20!10
    }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
    use                    local-service  ; Name of service template to use
    host_name              Nagios
    service_description    SSH
    check_command          check_ssh
    notifications_enabled  0
    }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                    local-service  ; Name of service template to use
    host_name              Nagios
    service_description    HTTP
    check_command          check_http
    notifications_enabled  0
    }

# END COPY/PASTE for SERVICES

# ---- kohkong ----

# Define a service to "ping" the machine (kohkong)
define service{
    use                  generic-service  ; Name of service template to use
    host_name            kohkong
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%
    }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
    use                  generic-service  ; Name of service template to use
    host_name            kohkong
    service_description  Root Partition
    check_command        check_local_disk!20%!10%!/
    }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
    use                  generic-service  ; Name of service template to use
    host_name            kohkong
    service_description  Current Users
    check_command        check_local_users!20!50
    }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
    use                  generic-service  ; Name of service template to use
    host_name            kohkong
    service_description  Total Processes
    check_command        check_local_procs!250!400!RSZDT
    }

# Define a service to check the load on the local machine.
define service{
    use                  generic-service  ; Name of service template to use
    host_name            kohkong
    service_description  Current Load
    check_command        check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
    use                  generic-service  ; Name of service template to use
    host_name            kohkong
    service_description  Swap Usage
    check_command        check_local_swap!20!10
    }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
    use                    generic-service  ; Name of service template to use
    host_name              kohkong
    service_description    SSH
    check_command          check_ssh
    notifications_enabled  0
    }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                    generic-service  ; Name of service template to use
    host_name              kohkong
    service_description    HTTP
    check_command          check_http
    notifications_enabled  0
    }

# ---- takeo ----

# Define a service to "ping" the machine (takeo)
define service{
    use                  local-service  ; Name of service template to use
    host_name            takeo
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%
    }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
    use                  local-service  ; Name of service template to use
    host_name            takeo
    service_description  Root Partition
    check_command        check_local_disk!20%!10%!/
    }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
    use                  local-service  ; Name of service template to use
    host_name            takeo
    service_description  Current Users
    check_command        check_local_users!20!50
    }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
    use                  local-service  ; Name of service template to use
    host_name            takeo
    service_description  Total Processes
    check_command        check_local_procs!250!400!RSZDT
    }

# Define a service to check the load on the local machine.
define service{
    use                  local-service  ; Name of service template to use
    host_name            takeo
    service_description  Current Load
    check_command        check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
    use                  local-service  ; Name of service template to use
    host_name            takeo
    service_description  Swap Usage
    check_command        check_local_swap!20!10
    }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
    use                    local-service  ; Name of service template to use
    host_name              takeo
    service_description    SSH
    check_command          check_ssh
    notifications_enabled  0
    }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                    local-service  ; Name of service template to use
    host_name              takeo
    service_description    HTTP_8080
    check_command          check_http_8080
    notifications_enabled  0
    }

8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
    use         generic-host      ; Inherit default values from a template
    host_name   remotehost        ; The name we're giving to this host
    alias       Some Remote Host  ; A longer name associated with the host
    address     192.168.1.50      ; IP address of the host
    hostgroups  allhosts          ; Host groups this host is associated with
    }
In our configuration, the entries for kohkong look like this:
    host_name        kohkong
    alias            localhost
    address          192.168.2.5
    parents          CistSW001
    icon_image       suse.png
    vrml_image       suse.png
    statusmap_image  suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
    name          check_http
    command_name  check_http
    command_line  $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
    }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
    use                  generic-service  ; Inherit default values from a template
    host_name            remotehost
    service_description  HTTP
    check_command        check_http
    }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?
Below is our own HTTP example, for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                    generic-service  ; Name of service template to use
    host_name              kohkong
    service_description    HTTP
    check_command          check_http
    notifications_enabled  0
    }
The same approach works for all services. A new service on a non-standard port, such as HTTP on port 8080, needs its own command definition and a service definition that uses it.
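As a hedged illustration of the port 8080 case: the takeo service list earlier in this file references a command named check_http_8080, whose definition is not shown in this document. A minimal sketch, assuming it is a thin wrapper around the standard check_http plugin and its -p (port) option, could look like this:

```cfg
# Assumed definition of check_http_8080 (not shown in the original file):
# the standard check_http plugin, pointed at TCP port 8080 with -p.
define command{
    command_name  check_http_8080
    command_line  $USER1$/check_http -I $HOSTADDRESS$ -p 8080
    }

# Service using that command on takeo. Notifications stay disabled,
# as for the other HTTP checks in this file.
define service{
    use                    local-service
    host_name              takeo
    service_description    HTTP_8080
    check_command          check_http_8080
    notifications_enabled  0
    }
```

The command definition would normally go into commands.cfg and the service into localhost.cfg, next to the other takeo services.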
8.6 Other cfg files (switch.cfg, printer.cfg, …)
All the other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> nagios-plugins-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as: memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes:
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
    nsclient++ /install
5. Install the NSClient++ systray with the following command:
    nsclient++ /SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), the network has changed with a new firewall (Takeo), so for the return route we also need to add the IP address of Takeo's LAN interface when the Windows machine is not in the same network as Nagios. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command:
    nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
    name                   windows-server    ; The name of this host template
    use                    generic-host      ; Inherit default values from the generic-host template
    check_period           24x7              ; By default, Windows servers are monitored round the clock
    check_interval         5                 ; Actively check the server every 5 minutes
    retry_interval         1                 ; Schedule host check retries at 1 minute intervals
    max_check_attempts     10                ; Check each server 10 times (max)
    check_command          check-host-alive  ; Default command to check if servers are "alive"
    notification_period    24x7              ; Send notification out at any time - day or night
    notification_interval  30                ; Resend notifications every 30 minutes
    notification_options   d,r               ; Only send notifications for specific host states
    contact_groups         admins            ; Notifications get sent to the admins by default
    register               0                 ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
    }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
    use         windows-server     ; Inherit default values from a template
    host_name   winserver          ; The name we're giving to this host
    alias       My Windows Server  ; A longer name associated with the host
    address     192.168.1.2        ; IP address of the host
    hostgroups  allhosts           ; Host groups this server is associated with
    }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:
define hostgroup{
    hostgroup_name  windows-servers  ; The name of the hostgroup
    alias           Windows Servers  ; Long name of the group
    members         winserver        ; Comma separated list of hosts that belong to this group
    }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
    command_name  check_nt
    command_line  $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
    }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
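To make the argument passing concrete: in a check_command value, the parts separated by ! are handed to the command definition as $ARG1$, $ARG2$, and so on. The sketch below traces the CPU load service defined later in this chapter. The expanded command line assumes that $USER1$ points to /usr/local/nagios/libexec and that the host address is 192.168.1.2, as in the winserver example; adjust both to your own setup.

```cfg
# check_command in the service definition:
#     check_nt!CPULOAD!-l 5,80,90
#
#   $ARG1$ = CPULOAD       (the variable to query, passed to -v)
#   $ARG2$ = -l 5,80,90    (extra options for that variable)
#
# Command line Nagios would run under the assumptions stated above:
#     /usr/local/nagios/libexec/check_nt -H 192.168.1.2 -p 12489 -v CPULOAD -l 5,80,90
define service{
    use                  generic-service
    host_name            winserver
    service_description  CPU Load
    check_command        check_nt!CPULOAD!-l 5,80,90
    }
```

Running the expanded command by hand from the Nagios server is a quick way to check that NSClient++ answers before the service is added to the configuration.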
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:
define service{
    use                  generic-service
    host_name            winserver
    service_description  NSClient++ Version
    check_command        check_nt!CLIENTVERSION
    }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:
define service{
    use                  generic-service
    host_name            winserver
    service_description  Uptime
    check_command        check_nt!UPTIME
    }
9.7 Monitoring Cpu Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater:
define service{
    use                  generic-service
    host_name            winserver
    service_description  CPU Load
    check_command        check_nt!CPULOAD!-l 5,80,90
    }
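The -l option of CPULOAD takes triplets of the form minutes,warning,critical, and check_nt accepts several triplets in one call. As a sketch (the 15-minute thresholds are illustrative values, not taken from the CIST configuration), the following service checks both the 5-minute and the 15-minute averages:

```cfg
# Two triplets: 5-minute average (warn 80%, crit 90%)
# and 15-minute average (warn 70%, crit 80%; illustrative values).
define service{
    use                  generic-service
    host_name            winserver
    service_description  CPU Load 5 and 15 min
    check_command        check_nt!CPULOAD!-l 5,80,90,15,70,80
    }
```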
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater:
define service{
    use                  generic-service
    host_name            winserver
    service_description  Memory Usage
    check_command        check_nt!MEMUSE!-w 80 -c 90
    }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater:
define service{
    use                  generic-service
    host_name            winserver
    service_description  C:\ Drive Space
    check_command        check_nt!USEDDISKSPACE!-l c -w 80 -c 90
    }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped:
define service{
    use                  generic-service
    host_name            winserver
    service_description  W3SVC
    check_command        check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
    }
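check_nt also accepts a comma-separated list of service names after -l, so several Windows services can be watched by a single check. A minimal sketch, assuming the Spooler (print spooler) service also runs on winserver; substitute the service names that matter on your own server:

```cfg
# One check for two Windows services; the check goes CRITICAL
# if either of them is stopped. "Spooler" is an assumed example.
define service{
    use                  generic-service
    host_name            winserver
    service_description  Web and Print Services
    check_command        check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
    }
```

The trade-off is that one combined check produces one alert, so separate service definitions remain preferable when different people need to be notified for each service.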
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running:
define service{
    use                  generic-service
    host_name            winserver
    service_description  Explorer
    check_command        check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
    }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the visual, human-readable status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case, we have also replaced the default Nagios theme with another one.
10.2 Add / Changing Icons
The Nagios icons exist in three formats: GIF, GD2 and PNG. The best thing to do is to use only PNG files, because you can then use the same icon for all the different modules of Nagios. Sample with kohkong:
    host_name        kohkong
    alias            localhost
    address          192.168.2.5
    parents          CistSW001
    icon_image       suse.png
    vrml_image       suse.png
    statusmap_image  suse.png
In this case we use suse.png.
10.2.1 Icon image
Used for the normal Nagios menu.
10.2.2 Vrml_image
Used for the 3D map environment. Because of our special Nagios theme, we do not use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
Used for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format in 40x40 px. You can also convert any icon you find on the Internet to this format. An online tool for this is available at http://www.easypict.org.
Where to put the icons:
Copy the icons with your favorite SSH file-transfer tool to /usr/local/nagios/share/images.
11 CIST Monitored Hosts
The map below shows all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.
[Figure: CIST network map. Networks: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23. Zones: Common Servers; Students PCs (~70 PCs) + VMware & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs). Switches: CISTSW001 to CISTSW006. Internet access through the ADSL gateway (modem, 512 Mb/s, fixed public IP address).]
Servers shown on the map:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): primary and secondary domain controllers, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, antispam, mail antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): students' file server, Moodle, antivirus, ERO, instant messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): proxy, firewall
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): learning management database, print server, staff file server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): data backup (Kohkong, Kandal, Pailin), Ghost server
3.3 Create Account Information
Become the root user:
    su -l
Create a new nagios user account and give it a password:
    /usr/sbin/useradd nagios
    passwd nagios
Create a new nagios group. Add the nagios user to the group:
    /usr/sbin/groupadd nagios
    /usr/sbin/usermod -G nagios nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group:
    /usr/sbin/groupadd nagcmd
    /usr/sbin/usermod -G nagcmd nagios
    /usr/sbin/usermod -G nagcmd wwwrun
3.4 Download Nagios and the Plugins
Create a directory for storing the downloads:
    mkdir ~/downloads
    cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). At the time of writing, the latest versions of Nagios and the Nagios plugins were 3.0a4 and 1.4.7, respectively:
    wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0a4.tar.gz
    wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.7.tar.gz
3.5 Compile and Install Nagios
Extract the Nagios source code tarball:
    cd ~/downloads
    tar xzf nagios-3.0a4.tar.gz
    cd nagios-3.0a4
Run the Nagios configure script, passing the name of the group you created earlier, like so:
    ./configure --with-command-group=nagcmd
Compile the Nagios source code:
    make all
Install binaries, init script, sample config files, and set permissions on the external command directory:
    make install
    make install-init
    make install-config
    make install-commandmode
Don't start Nagios yet - there's still more that needs to be done.
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web interface
Install the Nagios web config file in the Apache conf.d directory:
    make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later:
    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect, then check its status:
    service apache2 restart    (or: rcapache2 start)
    service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball:
    cd ~/downloads
    tar xzf nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7
Compile and install the plugins:
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make
    make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them start automatically when the system boots:
    chkconfig --add nagios
    chkconfig nagios on
    chkconfig --add apache2
    chkconfig apache2 on
Verify the sample Nagios configuration files:
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are problems:
    mkdir /home/nagios
    chown nagios /home/nagios
If there are no errors, start Nagios:
    service nagios start
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.
    http://localhost/nagios/ or http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this by:
- Opening the control center
- Selecting "Open Administrator Settings" to open the YaST administrator control center
- Selecting "Firewall" from the "Security and Users" category
- Clicking the "Allowed Services" option in the Firewall Configuration window
- Adding "HTTP Server" to the allowed services list for the "External Zone"
- Clicking "Next" and "Accept" to activate the new firewall settings
4 Nagios Documentations
The main Nagios site is www.nagios.org.
The online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main Nagios web interface.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven mainly by configuration files (.cfg files). To activate a new configuration, edit these files with your favorite editor (vi, gedit, …). You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on host/object definitions.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the online Nagios documentation (see section 4).
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.).
- Contacts receive notifications for the hosts and services they are responsible for.
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resources.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and httpasswd.cfg. Just copy them to a directory of your choice, and that's it.
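The copy described above can be scripted. The following is a minimal sketch, assuming the configuration lives in /usr/local/nagios/etc as stated in 8.1. The function name backup_nagios_cfg and the destination /root/nagios-backup are hypothetical choices for illustration and are not part of Nagios. It copies every regular file found directly in the source directory rather than a fixed list of names.

```shell
#!/bin/sh
# Sketch: copy the top-level Nagios configuration files into a dated folder.
# Usage: backup_nagios_cfg SRC_DIR DEST_ROOT   (helper name is illustrative)
backup_nagios_cfg() {
    src=$1
    dest="$2/nagios-etc-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    # Copy each regular file, keeping timestamps and permissions (-p).
    # Subdirectories are skipped.
    for f in "$src"/*; do
        if [ -f "$f" ]; then
            cp -p "$f" "$dest"/
        fi
    done
    # Print where the backup was written.
    echo "$dest"
}

# Example call (run as root; the destination is an assumption):
# backup_nagios_cfg /usr/local/nagios/etc /root/nagios-backup
```

To restore, copy the files back into /usr/local/nagios/etc and verify the configuration before restarting Nagios.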
8.3 Nagios.cfg
nagios.cfg is the master file that loads all the other configuration files. By default Nagios is "not open", meaning the other configuration files are not read, so you need to enable them. Look for the lines marked "<--------- UNCOMMENT" below and uncomment them to activate them.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file.

log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg       <--------- UNCOMMENT

# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg        <--------- UNCOMMENT

# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg        <--------- UNCOMMENT

# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg         <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:

#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
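After editing nagios.cfg it can help to confirm which object files are really active. The sketch below assumes the file path given in section 8.1. The helper name list_active_cfg is hypothetical. It simply prints the cfg_file and cfg_dir lines that are not commented out.

```shell
#!/bin/sh
# Sketch: list the object files and directories Nagios will actually read.
# Lines that still start with '#' are comments and are not matched.
list_active_cfg() {
    # $1 = path to the main configuration file
    grep -E '^[[:space:]]*(cfg_file|cfg_dir)=' "$1"
}

# Example call, using the path from section 8.1:
# list_active_cfg /usr/local/nagios/etc/nagios.cfg
```

If a file you expect is missing from the output, its line is probably still commented out.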
8.4 Localhost.cfg
By default localhost.cfg describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an extremely simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           Normal Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS

# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }

# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server    ; The name of this host template
        use                     generic-host    ; This template inherits other values from the generic-host template
        check_period            24x7            ; By default, Linux hosts are checked round the clock
        check_interval          5               ; Actively check the host every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each Linux host 10 times (max)
        check_command           check-host-alive ; Default command to check Linux hosts
        notification_period     workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after this one.

# Nagios host
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

# kohkong
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

# takeo ("," is used to list several IP addresses)
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new host to the allhosts group here, like kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# Local service definition template - This is NOT a real service, just a template!
define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# End COPY/PASTE for SERVICES
# kohkong

# Define a service to "ping" the local machine
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# takeo

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
In our configuration, the definition for kohkong contains:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         Cist SW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
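Since new hosts are added by copying an existing definition, the stanza can also be generated. The following is a sketch only. The helper name print_linux_host and the sample values newhost and 192.168.2.9 are hypothetical, while the linux-server template and the directive names come from the localhost.cfg listing in section 8.4.

```shell
#!/bin/sh
# Sketch: print a host definition that inherits from the linux-server template.
# Usage: print_linux_host NAME ADDRESS [PARENT]   (helper name is illustrative)
print_linux_host() {
    echo "define host{"
    echo "        use             linux-server"
    echo "        host_name       $1"
    echo "        alias           $1"
    echo "        address         $2"
    # The parents directive is optional; emit it only when a parent is given.
    if [ -n "${3:-}" ]; then
        echo "        parents         $3"
    fi
    echo "        }"
}

# Example with made-up values: review the output before adding it to localhost.cfg.
print_linux_host newhost 192.168.2.9 CistSW001
```

Remember to add the new host_name to the members of the allhosts hostgroup, and to verify the configuration with nagios -v before restarting Nagios.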
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our example for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
The same can be done for all services. For an example of a new service, such as HTTP on port 8080, see the HTTP_8080 service defined for takeo at the end of the localhost.cfg listing in section 8.4.
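The HTTP_8080 service for takeo uses a command named check_http_8080, whose definition in commands.cfg is not reproduced in this document. As an assumption of what such a definition could look like, the sketch below prints a command stanza that calls check_http with its -p (port) option. The helper name print_http_port_command is hypothetical.

```shell
#!/bin/sh
# Sketch: print a command definition that runs check_http on a given port.
# Usage: print_http_port_command PORT   (helper name is illustrative)
print_http_port_command() {
    port=$1
    # \$ keeps the Nagios macros ($USER1$, $HOSTADDRESS$) literal in the output.
    cat <<EOF
define command{
        command_name    check_http_$port
        command_line    \$USER1\$/check_http -I \$HOSTADDRESS\$ -p $port
        }
EOF
}

# Example: a definition for port 8080, to be reviewed before adding it to commands.cfg.
print_http_port_command 8080
```

A service then refers to it with check_command check_http_8080, as in the takeo example.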
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
    Windows machines -> Nsclient.exe
    Linux machines   -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as:
- Memory usage
- CPU load
- Disk usage
- Service states
- Running processes
- etc.
Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc. and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
    nsclient++ /install
5. Install the NSClient++ systray with the following command:
    nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
In our case (CIST) we have made a change with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and the NSClient. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command:
    nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notification out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
99 Monitoring Disk Usage
The following service definition will monitor usage of the C drive on the Windows server and generate a CRITICAL alert if disk usage is 90 or more or a WARNING alert if disk usage is 80 or greater define service
use generic-service
host_name winserver
service_description C Drive Space
check_command check_ntUSEDDISKSPACE-l c -w 80 -c 90
910 Monitoring A Windows Service
The following service definition will monitoring the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped define service
use generic-service
host_name winserver
service_description W3SVC
check_command check_ntSERVICESTATE-d SHOWALL -l W3SVC
911 Monitoring A Windows Process
The following service definition will monitoring the Explorerexe process on the Windows machine and generate a CRITICAL alert if the process is not running define service
use generic-service
host_name winserver
service_description Explorer
check_command check_ntPROCSTATE-d SHOWALL -l Explorerexe
IT Department Page 36
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
10 Statusmap
10.1 How to Get a Smooth Map
The Statusmap is the human-readable view of the status of the CIST network.
To get a smooth map we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons come in several formats, GIF and GD2 among them. The best approach, however, is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Example with kohkong:
    host_name           kohkong
    alias               localhost
    address             192.168.2.5
    parents             CistSW001
    icon_image          suse.png
    vrml_image          suse.png
    statusmap_image     suse.png
In this case we use suse.png for all three images.
10.2.1 Icon image
icon_image is used in the normal menus of Nagios.
10.2.2 Vrml_image
vrml_image is used in the 3D map environment. Because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 pixels, so you can also convert any icons you find on the Internet to this format. An online tool for doing this is available at http://www.easypict.org.
Where to put the icons: copy them with your favorite SSH file transfer client into /usr/local/nagios/share/images.
11 CIST Monitored Hosts
The map below shows all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.
[Figure: map of the CIST network. The labels that can be recovered from it are listed here.]
Networks: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Zones: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary and Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, fixed public IP address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
3.6 Customize Configuration
Edit the localhost.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.
3.7 Configure the Web Interface
Install the Nagios web config file in the Apache conf.d directory:
    make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account; you'll need it later.
    htpasswd2 -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect:
    service apache2 restart
or:
    rcapache2 start
    service apache2 status
3.8 Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball:
    cd ~/downloads
    tar xzf nagios-plugins-1.4.7.tar.gz
    cd nagios-plugins-1.4.7
Compile and install the plugins:
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make
    make install
3.9 Start Nagios
Add Nagios and Apache to the list of system services and have them start automatically when the system boots:
    chkconfig --add nagios
    chkconfig nagios on
    chkconfig --add apache2
    chkconfig apache2 on
Verify the sample Nagios configuration files:
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are problems:
    mkdir /home/nagios
    chown nagios /home/nagios
If there are no errors, start Nagios:
    service nagios start
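The verify-then-start sequence above can be wrapped in a small script so that Nagios is only touched when the configuration check passes. This is a minimal sketch rather than part of the original procedure: the helper name is hypothetical, and the paths are the ones used in this guide.

```shell
#!/bin/sh
# Paths as used in this guide; override them through the environment if needed.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

# Hypothetical helper: succeed only if "nagios -v" accepts the configuration.
config_is_valid() {
    "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1
}

if config_is_valid; then
    echo "Configuration OK, Nagios can be (re)started"
    # service nagios restart    # left commented out: run it yourself as root
else
    echo "Configuration check failed, leaving Nagios untouched"
fi
```

The restart line is deliberately left as a comment, so the script only reports the result until you decide to enable it.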
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at one of the URLs below. You'll be prompted for the username (nagiosadmin) and the password you specified earlier.
    http://localhost/nagios/
or:
    http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this as follows:
1. Open the control center.
2. Select "Open Administrator Settings" to open the YaST administrator control center.
3. Select "Firewall" from the "Security and Users" category.
4. Click the "Allowed Services" option in the Firewall Configuration window.
5. Add "HTTP Server" to the allowed services list for the "External Zone".
6. Click "Next" and "Accept" to activate the new firewall settings.
4 Nagios Documentation
The main Nagios site is www.nagios.org.
The online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main page of the Nagios web interface.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven by configuration files (.cfg files). These files have to be modified with your favorite editor (vi, gedit, ...) to activate a new configuration. You need root access to modify them.
7 Basics of Nagios
7.1 Description
Nagios works with several configuration files, which are based on the configuration of hosts and other objects.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation on object definitions.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.).
- Contacts receive notifications for the hosts and services they are responsible for.
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc.
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and the htpasswd.users password file. Just copy them to your favorite directory and that's it.
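As an illustration of such a backup, the sketch below copies every .cfg file plus htpasswd.users into a dated directory. The helper name and the target directory /root/nagios-backups are assumptions made for the example, not something prescribed by this guide.

```shell
#!/bin/sh
# Hypothetical helper: copy the Nagios configuration files from directory $1
# into a dated sub-directory of $2 and print the path of the backup.
backup_nagios_cfg() {
    src="$1"
    dest="$2/nagios-etc-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    # All .cfg files plus the web interface password file.
    for f in "$src"/*.cfg "$src"/htpasswd.users; do
        if [ -f "$f" ]; then
            cp -p "$f" "$dest/" || return 1
        fi
    done
    echo "$dest"
}

# Example call (the target directory is an assumption):
# backup_nagios_cfg /usr/local/nagios/etc /root/nagios-backups
```

Using a dated sub-directory keeps earlier backups intact, which is handy when a configuration change has to be rolled back.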
8.3 nagios.cfg
nagios.cfg is the master file that loads all the other files. By default Nagios is "not open", so you need to enable the reading of the other configuration files. Look for the lines marked "<--------- UNCOMMENT" in the listing and uncomment them to activate the files. The arrows are annotations in this document and are not part of the file.
    ##############################################################################
    # NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
    #
    # Read the documentation for more information on this configuration
    # file.  I've provided some comments here, but things may not be so
    # clear without further explanation.
    #
    # Last Modified: 05-08-2007
    ##############################################################################

    # LOG FILE
    # This is the main log file where service and host events are logged
    # for historical purposes.  This should be the first option specified
    # in the config file!

    log_file=/usr/local/nagios/var/nagios.log

    # OBJECT CONFIGURATION FILE(S)
    # These are the object configuration files in which you define hosts,
    # host groups, contacts, contact groups, services, etc.
    # You can split your object definitions across several config files
    # if you wish (as shown below), or keep them all in a single config file.

    # Command definitions
    cfg_file=/usr/local/nagios/etc/commands.cfg         <--------- UNCOMMENT

    # Host and service definitions, etc. for monitoring this machine
    cfg_file=/usr/local/nagios/etc/localhost.cfg        <--------- UNCOMMENT

    # Sample definitions for monitoring a Windows machine
    cfg_file=/usr/local/nagios/etc/windows.cfg          <--------- UNCOMMENT

    # Sample definitions for monitoring a network printer
    cfg_file=/usr/local/nagios/etc/printer.cfg          <--------- UNCOMMENT

    # Sample definitions for monitoring a switch/router
    cfg_file=/usr/local/nagios/etc/switch.cfg           <--------- UNCOMMENT

    # You can also tell Nagios to process all config files (with a .cfg
    # extension) in a particular directory by using the cfg_dir
    # directive as shown below:

    #cfg_dir=/usr/local/nagios/etc/servers
    #cfg_dir=/usr/local/nagios/etc/printers
    #cfg_dir=/usr/local/nagios/etc/switches
    #cfg_dir=/usr/local/nagios/etc/routers
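Uncommenting the cfg_file lines can be done in an editor, or with a one-line sed filter as sketched here. The helper name is hypothetical and the output file /tmp/nagios.cfg.new is only an example; the filter assumes the lines are commented out with a leading "#", as in the sample file.

```shell
#!/bin/sh
# Hypothetical helper: print file $1 with every commented-out cfg_file line
# activated. Other comments and the cfg_dir lines are left untouched.
enable_cfg_files() {
    sed 's/^#[[:space:]]*\(cfg_file=\)/\1/' "$1"
}

# Example use (review the result before replacing the real file):
# enable_cfg_files /usr/local/nagios/etc/nagios.cfg > /tmp/nagios.cfg.new
```

Writing to a separate file first leaves the original nagios.cfg untouched until the result has been checked with nagios -v.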
8.4 localhost.cfg
By default localhost.cfg only concerns the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
    ###############################################################################
    # LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
    #
    # Last Modified: 02-27-2007
    #
    # NOTE: This config file is intended to serve as an extremely simple
    #       example of how you can create your object configuration file(s).
    ###############################################################################

    # TIME PERIODS

    # This defines a timeperiod where all times are valid for checks,
    # notifications, etc.  The classic "24x7" support nightmare. :-)
    define timeperiod{
            timeperiod_name 24x7
            alias           24 Hours A Day, 7 Days A Week
            sunday          00:00-24:00
            monday          00:00-24:00
            tuesday         00:00-24:00
            wednesday       00:00-24:00
            thursday        00:00-24:00
            friday          00:00-24:00
            saturday        00:00-24:00
            }

    # 'workhours' timeperiod definition
    define timeperiod{
            timeperiod_name workhours
            alias           "Normal" Work Hours
            monday          09:00-17:00
            tuesday         09:00-17:00
            wednesday       09:00-17:00
            thursday        09:00-17:00
            friday          09:00-17:00
            }

    # 'none' timeperiod definition
    define timeperiod{
            timeperiod_name none
            alias           No Time Is A Good Time
            }
    # COMMANDS
    # NOTE: Sample command definitions can now be found in the sample commands.cfg file

    # CONTACTS

    # Generic contact definition template - This is NOT a real contact, just a template!
    define contact{
            name                            generic-contact         ; The name of this contact template
            service_notification_period     24x7                    ; service notifications can be sent anytime
            host_notification_period        24x7                    ; host notifications can be sent anytime
            service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
            host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
            service_notification_commands   notify-service-by-email ; send service notifications via email
            host_notification_commands      notify-host-by-email    ; send host notifications via email
            register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
            }

    # Just one contact defined by default - the Nagios admin (that's you)
    define contact{
            contact_name    nagiosadmin             ; Short name of user
            use             generic-contact         ; Inherit default values from generic-contact template (defined above)
            alias           Nagios Admin            ; Full name of user
            email           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
            }
    # CONTACT GROUPS

    # We only have one contact in this simple configuration file, so there is
    # no need to create more than one contact group.
    define contactgroup{
            contactgroup_name       admins
            alias                   Nagios Administrators
            members                 nagiosadmin
            }

    # HOSTS

    # Generic host definition template - This is NOT a real host, just a template!
    define host{
            name                            generic-host    ; The name of this host template
            notifications_enabled           1               ; Host notifications are enabled
            event_handler_enabled           1               ; Host event handler is enabled
            flap_detection_enabled          1               ; Flap detection is enabled
            failure_prediction_enabled      1               ; Failure prediction is enabled
            process_perf_data               1               ; Process performance data
            retain_status_information       1               ; Retain status information across program restarts
            retain_nonstatus_information    1               ; Retain non-status information across program restarts
            notification_period             24x7            ; Send host notifications at any time
            register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
            }

    # Linux host definition template - This is NOT a real host, just a template!
    define host{
            name                    linux-server    ; The name of this host template
            use                     generic-host    ; This template inherits other values from the generic-host template
            check_period            24x7            ; By default, Linux hosts are checked round the clock
            check_interval          5               ; Actively check the host every 5 minutes
            retry_interval          1               ; Schedule host check retries at 1 minute intervals
            max_check_attempts      10              ; Check each Linux host 10 times (max)
            check_command           check-host-alive ; Default command to check Linux hosts
            notification_period     workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                    ; Note that the notification_period variable is being overridden from
                                                    ; the value that is inherited from the generic-host template!
            notification_interval   120             ; Resend notifications every 2 hours
            notification_options    d,u,r           ; Only send notifications for specific host states
            contact_groups          admins          ; Notifications get sent to the admins by default
            register                0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
            }
    # Since this is a simple configuration file, we only monitor one host - the
    # local host (this machine).
    # Add your new Linux or similar servers after this one.

    # The Nagios host itself
    define host{
            use                 linux-server        ; Name of host template to use
                                                    ; This host definition will inherit all variables that are defined
                                                    ; in (or inherited by) the linux-server host template definition.
            host_name           Nagios
            alias               localhost
            address             127.0.0.1
            icon_image          ultrapenguin.png
            vrml_image          ultrapenguin.png
            statusmap_image     ultrapenguin.png
            }

    # kohkong
    define host{
            use                 linux-server        ; Name of host template to use
                                                    ; This host definition will inherit all variables that are defined
                                                    ; in (or inherited by) the linux-server host template definition.
            host_name           kohkong
            alias               localhost
            address             192.168.2.5
            parents             CistSW001
            icon_image          suse.png
            vrml_image          suse.png
            statusmap_image     suse.png
            }
    # takeo
    define host{
            use                 linux-server        ; Name of host template to use
                                                    ; This host definition will inherit all variables that are defined
                                                    ; in (or inherited by) the linux-server host template definition.
            host_name           takeo
            alias               localhost
            address             192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1  ; "," to list several IP addresses
            parents             CistSW001
            icon_image          suse.png
            vrml_image          suse.png
            statusmap_image     suse.png
            }
    # HOST GROUPS

    # We only have one host in our simple config file, so there is no need to
    # create more than one hostgroup.
    define hostgroup{
            hostgroup_name  allhosts
            alias           All Hosts
            members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, like kohkong
            }

    # SERVICES

    # Generic service definition template - This is NOT a real service, just a template!
    define service{
            name                            generic-service ; The name of this service template
            active_checks_enabled           1               ; Active service checks are enabled
            passive_checks_enabled          1               ; Passive service checks are enabled/accepted
            parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
            obsess_over_service             1               ; We should obsess over this service (if necessary)
            check_freshness                 0               ; Default is to NOT check service 'freshness'
            notifications_enabled           1               ; Service notifications are enabled
            event_handler_enabled           1               ; Service event handler is enabled
            flap_detection_enabled          1               ; Flap detection is enabled
            failure_prediction_enabled      1               ; Failure prediction is enabled
            process_perf_data               1               ; Process performance data
            retain_status_information       1               ; Retain status information across program restarts
            retain_nonstatus_information    1               ; Retain non-status information across program restarts
            is_volatile                     0               ; The service is not volatile
            check_period                    24x7            ; The service can be checked at any time of the day
            max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
            normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
            retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
            contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
            notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
            notification_interval           60              ; Re-notify about service problems every hour
            notification_period             24x7            ; Notifications can be sent out at any time
            register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
            }

    # Local service definition template - This is NOT a real service, just a template!
    define service{
            name                            local-service   ; The name of this service template
            use                             generic-service ; Inherit default values from the generic-service definition
            max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
            normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
            retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
            register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
            }
    # START COPY/PASTE for SERVICES

    # Define a service to "ping" the local machine
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     PING
            check_command           check_ping!100.0,20%!500.0,60%
            }

    # Define a service to check the disk space of the root partition
    # on the local machine.  Warning if < 20% free, critical if
    # < 10% free space on partition.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     Root Partition
            check_command           check_local_disk!20%!10%!/
            }

    # Define a service to check the number of currently logged in
    # users on the local machine.  Warning if > 20 users, critical
    # if > 50 users.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     Current Users
            check_command           check_local_users!20!50
            }

    # Define a service to check the number of currently running procs
    # on the local machine.  Warning if > 250 processes, critical if
    # > 400 processes.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     Total Processes
            check_command           check_local_procs!250!400!RSZDT
            }

    # Define a service to check the load on the local machine.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     Current Load
            check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
            }

    # Define a service to check the swap usage of the local machine.
    # Critical if less than 10% of swap is free, warning if less than 20% is free.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     Swap Usage
            check_command           check_local_swap!20!10
            }

    # Define a service to check SSH on the local machine.
    # Disable notifications for this service by default, as not all users may have SSH enabled.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     SSH
            check_command           check_ssh
            notifications_enabled   0
            }

    # Define a service to check HTTP on the local machine.
    # Disable notifications for this service by default, as not all users may have HTTP enabled.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               Nagios
            service_description     HTTP
            check_command           check_http
            notifications_enabled   0
            }

    # END COPY/PASTE for SERVICES
    # Services for kohkong

    # Define a service to "ping" the machine kohkong
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     PING
            check_command           check_ping!100.0,20%!500.0,60%
            }

    # Define a service to check the disk space of the root partition
    # on the local machine.  Warning if < 20% free, critical if
    # < 10% free space on partition.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     Root Partition
            check_command           check_local_disk!20%!10%!/
            }

    # Define a service to check the number of currently logged in
    # users on the local machine.  Warning if > 20 users, critical
    # if > 50 users.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     Current Users
            check_command           check_local_users!20!50
            }

    # Define a service to check the number of currently running procs
    # on the local machine.  Warning if > 250 processes, critical if
    # > 400 processes.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     Total Processes
            check_command           check_local_procs!250!400!RSZDT
            }

    # Define a service to check the load on the local machine.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     Current Load
            check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
            }

    # Define a service to check the swap usage of the local machine.
    # Critical if less than 10% of swap is free, warning if less than 20% is free.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     Swap Usage
            check_command           check_local_swap!20!10
            }

    # Define a service to check SSH on the local machine.
    # Disable notifications for this service by default, as not all users may have SSH enabled.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     SSH
            check_command           check_ssh
            notifications_enabled   0
            }

    # Define a service to check HTTP on the local machine.
    # Disable notifications for this service by default, as not all users may have HTTP enabled.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     HTTP
            check_command           check_http
            notifications_enabled   0
            }
    # Services for takeo

    # Define a service to "ping" the machine takeo
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     PING
            check_command           check_ping!100.0,20%!500.0,60%
            }

    # Define a service to check the disk space of the root partition
    # on the local machine.  Warning if < 20% free, critical if
    # < 10% free space on partition.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     Root Partition
            check_command           check_local_disk!20%!10%!/
            }

    # Define a service to check the number of currently logged in
    # users on the local machine.  Warning if > 20 users, critical
    # if > 50 users.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     Current Users
            check_command           check_local_users!20!50
            }

    # Define a service to check the number of currently running procs
    # on the local machine.  Warning if > 250 processes, critical if
    # > 400 processes.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     Total Processes
            check_command           check_local_procs!250!400!RSZDT
            }

    # Define a service to check the load on the local machine.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     Current Load
            check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
            }

    # Define a service to check the swap usage of the local machine.
    # Critical if less than 10% of swap is free, warning if less than 20% is free.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     Swap Usage
            check_command           check_local_swap!20!10
            }

    # Define a service to check SSH on the local machine.
    # Disable notifications for this service by default, as not all users may have SSH enabled.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     SSH
            check_command           check_ssh
            notifications_enabled   0
            }

    # Define a service to check HTTP on the local machine.
    # Disable notifications for this service by default, as not all users may have HTTP enabled.
    define service{
            use                     local-service   ; Name of service template to use
            host_name               takeo
            service_description     HTTP_8080
            check_command           check_http_8080
            notifications_enabled   0
            }
8.5 Explanations of the localhost File and Services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
    define host{
            use             generic-host            ; Inherit default values from a template
            host_name       remotehost              ; The name we're giving to this host
            alias           Some Remote Host        ; A longer name associated with the host
            address         192.168.1.50            ; IP address of the host
            hostgroups      allhosts                ; Host groups this host is associated with
            }
In our case, for kohkong, it looks like this:
    host_name           kohkong
    alias               localhost
    address             192.168.2.5
    parents             CistSW001
    icon_image          suse.png
    vrml_image          suse.png
    statusmap_image     suse.png
Now that a definition has been added for the host that will be monitored, we can start defining the services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
    define command{
            name            check_http
            command_name    check_http
            command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
            }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
    define service{
            use                     generic-service ; Inherit default values from a template
            host_name               remotehost
            service_description     HTTP
            check_command           check_http
            }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own example for HTTP, with kohkong:
    # Define a service to check HTTP on the local machine.
    # Disable notifications for this service by default, as not all users may have HTTP enabled.
    define service{
            use                     generic-service ; Name of service template to use
            host_name               kohkong
            service_description     HTTP
            check_command           check_http
            notifications_enabled   0
            }
The same approach works for all services, including a new service such as HTTP on port 8080 (the check_http_8080 command used for takeo in localhost.cfg).
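The takeo service refers to a check_http_8080 command, which has to exist in commands.cfg before Nagios will accept the configuration. The definition actually used at CIST is not reproduced in this guide, so the sketch below is an assumption: it reuses the check_http command line and adds the plugin's -p option to select port 8080. The helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a check_http_8080 command definition to the
# commands file given as $1, unless one is already present.
add_http_8080_command() {
    cfg="$1"
    if grep -q 'command_name[[:space:]]*check_http_8080' "$cfg" 2>/dev/null; then
        return 0
    fi
    # The quoted 'EOF' keeps $USER1$, $HOSTADDRESS$ and $ARG1$ literal.
    cat >> "$cfg" <<'EOF'

define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }
EOF
}

# Example call:
# add_http_8080_command /usr/local/nagios/etc/commands.cfg
```

Running the helper twice leaves a single definition in the file, so it is safe to call it again after an upgrade or a restore.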
8.6 Other cfg Files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that each of them is specially designed for one kind of device: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
- Windows machines -> Nsclient.exe
- Linux machines -> nagios-plugins-1.4.7
9 Monitoring Windows Machines
91 Introduction
This document describes how you can monitor private services and attributes of Windows machines such as Memory usage CPU load Disk usage Service states Running processes etc Publicly available services that are provided by Windows machines (HTTP FTP POP3 etc) can be
monitored easily by following the documentation on monitoring publicly available services
Notes
These instructions assume that youve installed Nagios according to the quickstart guide The sample configuration entries below reference objects that are defined in the sample commandscfg and localhostcfg config files For your convenience the configuration examples given below can be found in a sample windowscfg config file that gets installed when you following the quickstart guide After reading these instructions just edit the windowscfg file to customize the host name IP address etc and uncomment the reference to the windowscfg file in the nagioscfg file
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
   nsclient++ /install
5. Install the NSClient++ systray with the following command:
   nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have made a change with a new firewall (Takeo). For the return route we therefore also need to add the IP address of the LAN interface of Takeo, if it is not in the same network as Nagios and the NSClient. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by removing the comment marker at the start of the line.
8. Start the NSClient++ service with the following command:
   nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file.
First, it is best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notification out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
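Before relying on the service definitions in the following sections, it can help to run check_nt once by hand from the Nagios server, to confirm that the agent answers. The helper below is a sketch, not part of Nagios: the function name and the CHECK_NT variable are hypothetical, and the plugin path assumes the usual quickstart layout in which $USER1$ points to /usr/local/nagios/libexec.

```shell
# Sketch only: query the NSClient++ agent by hand with check_nt.
# Assumptions: plugins are installed in /usr/local/nagios/libexec and the
# agent listens on port 12489, as in the check_nt command definition.
probe_windows_agent() {
    host=$1
    plugin="${CHECK_NT:-/usr/local/nagios/libexec/check_nt}"
    if [ ! -x "$plugin" ]; then
        echo "check_nt not found at $plugin" >&2
        return 3    # 3 is the conventional UNKNOWN plugin status
    fi
    # Same arguments Nagios would pass for check_nt!CLIENTVERSION
    "$plugin" -H "$host" -p 12489 -v CLIENTVERSION
}

# Example: probe_windows_agent 192.168.1.2
```

If the probe fails, the allowed_hosts setting in NSC.INI and any firewall between the Nagios server and the Windows machine are the first things to check.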
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable, visual status of the CIST network.
To get a smooth map we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons exist in several formats, such as GIF and GD2. The best approach is to use only PNG files, so that you have the same icon in all the different modules of Nagios.
Sample with kohkong:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
In this case we use suse.png.
10.2.1 Icon image
This image is used in the normal menus of Nagios.
10.2.2 Vrml_image
This image is used in the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
This image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 pixels, so you can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons?
The icons have to be copied, with your favorite SSH file transfer tool, to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited…
[Figure: CIST network map. The labels recovered from the diagram are listed below.]
Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet Access; Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, Fixed Public IP Address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
3.10 Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.
http://localhost/nagios/ or http://pursat/nagios/
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
3.11 Other Modifications
Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely. You can do this by:
- Opening the control center
- Selecting "Open Administrator Settings" to open the YaST administrator control center
- Selecting "Firewall" from the "Security and Users" category
- Clicking the "Allowed Services" option in the Firewall Configuration window
- Adding "HTTP Server" to the allowed services list for the "External Zone"
- Clicking "Next" and "Accept" to activate the new firewall settings
4 Nagios Documentations
The main Nagios site is www.nagios.org
The online documentation is at http://nagios.sourceforge.net/docs/3_0/
You can also access the documentation directly from the main web interface of Nagios.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven by configuration files (.cfg files). To activate a new configuration, these files have to be modified with your favorite editor (vi, gedit, …). You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on host/object configuration.
7.2 What Are Objects
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
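To see which object files and directories a given main configuration file actually loads, the active directives can be listed from the shell. This is a small sketch; the function name is hypothetical and the default path assumes the /usr/local/nagios/etc location used in this document.

```shell
# Sketch only: print the active cfg_file= and cfg_dir= directives of a main
# configuration file. Lines commented out with '#' are not matched.
list_object_sources() {
    main_cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    grep -E '^[[:space:]]*cfg_(file|dir)=' "$main_cfg"
}

# Example: list_object_sources /usr/local/nagios/etc/nagios.cfg
```

A file that is missing from this output is not read by Nagios, however correct its contents are.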
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the online documentation.
Once you are familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.)
- Hosts have an address of some kind (e.g. an IP or MAC address)
- Hosts have one or more services associated with them
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for the hosts and services they are responsible for
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
8 Configurations Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to your favorite directory, and that's it.
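The copy can be scripted so that each backup gets its own timestamped archive. The following is a minimal sketch under a few assumptions: the function name and the default destination directory are hypothetical, and it archives the whole configuration directory rather than the individual files listed above.

```shell
# Sketch only: archive the Nagios configuration directory into a
# timestamped tar.gz file and print the path of the archive.
backup_nagios_etc() {
    src="${1:-/usr/local/nagios/etc}"      # directory to back up
    dest="${2:-$HOME/nagios-backup}"       # hypothetical backup location
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/nagios-etc-$stamp.tar.gz"
    mkdir -p "$dest" || return 1
    # -C keeps only the last path component (e.g. etc/) inside the archive
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}

# Example (as root): backup_nagios_etc /usr/local/nagios/etc /root/nagios-backup
```

Archiving the whole directory avoids having to keep a file list up to date when new .cfg files are added.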
8.3 Nagios.cfg
Nagios.cfg is the master file that loads all the other files. By default Nagios is "not open": you have to enable the reading of the other configuration files. Look for the "<--------- UNCOMMENT" marks in the listing and uncomment those lines to activate them.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file!!!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg         <--------- UNCOMMENT
# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg        <--------- UNCOMMENT
# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg          <--------- UNCOMMENT
# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg          <--------- UNCOMMENT
# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg           <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
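Uncommenting a cfg_file line can be done in an editor or from the shell. The helper below is a sketch only: the function name is hypothetical, and it assumes that disabled entries look like "#cfg_file=/path/to/name.cfg" with nothing else on the line.

```shell
# Sketch only: remove the leading '#' from the cfg_file line that ends in
# the given file name, leaving every other line untouched.
enable_cfg_file() {
    main_cfg=$1      # path to nagios.cfg
    name=$2          # object file name, e.g. windows.cfg
    tmp="${main_cfg}.tmp.$$"
    sed "s|^#[[:space:]]*\(cfg_file=.*/${name}\)[[:space:]]*\$|\1|" "$main_cfg" > "$tmp" \
        && cat "$tmp" > "$main_cfg" \
        && rm -f "$tmp"
}

# Example (as root):
#   enable_cfg_file /usr/local/nagios/etc/nagios.cfg windows.cfg
```

After any change, the configuration can be verified with /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg before Nagios is restarted.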
8.4 Localhost.cfg
By default, localhost.cfg describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
#
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server            ; The name of this host template
        use                     generic-host            ; This template inherits other values from the generic-host template
        check_period            24x7                    ; By default, Linux hosts are checked round the clock
        check_interval          5                       ; Actively check the host every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each Linux host 10 times (max)
        check_command           check-host-alive        ; Default command to check Linux hosts
        notification_period     workhours               ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval   120                     ; Resend notifications every 2 hours
        notification_options    d,u,r                   ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers below.

# Nagios host
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               Nagios
        alias                   localhost
        address                 127.0.0.1
        icon_image              ultrapenguin.png
        vrml_image              ultrapenguin.png
        statusmap_image         ultrapenguin.png
        }

# kohkong
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# takeo
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               takeo
        alias                   localhost
        address                 192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1  ; use "," to list several IP addresses
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, as was done for kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# End COPY/PASTE for SERVICES
# kohkong

# Define a service to "ping" the local machine
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# takeo

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
In our configuration, the kohkong host is defined like this:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?
Here is our own HTTP example, as defined for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
The same can be done for every service. A new service, such as HTTP on port 8080, is created in the same way (see the HTTP_8080 service defined for takeo in localhost.cfg).
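The HTTP_8080 service defined for takeo calls a check_http_8080 command, which the sample commands.cfg does not provide. The command actually used at CIST is not reproduced in this document, so the following is only a minimal sketch of what such a definition could look like. It assumes the command is added to commands.cfg and that the web server listens on TCP port 8080; the -p option of check_http selects the port.

```cfg
# Hypothetical command definition (an assumption, not copied from the CIST files).
# Identical to the stock check_http command, except that -p forces TCP port 8080.
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }
```

Any service can then use it with "check_command check_http_8080", exactly as the stock check_http command is used.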
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that each one is designed for one kind of device: printers, switches and Windows machines. For Windows and Linux servers, remember that you need an agent on the server to check its services:
Windows machines -> NSClient++ (nsclient++.exe)
Linux machines -> nagios-plugins-1.4.7
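As an illustration of the "same mechanism", here is a minimal sketch of what an entry in switch.cfg could look like for the switch that kohkong and takeo use as their parent. The address and alias are placeholders, and the host-level directives are simply copied from the style of the other templates in this document; the real CIST switch definitions are not reproduced here.

```cfg
# Sketch of a switch.cfg entry (address and alias are placeholders).
define host{
        use                     generic-host            ; Template defined in localhost.cfg
        host_name               CistSW001
        alias                   CIST switch 001
        address                 192.0.2.2               ; Placeholder address, replace with the real one
        check_period            24x7
        check_interval          5
        retry_interval          1
        max_check_attempts      10
        check_command           check-host-alive
        notification_interval   30
        notification_options    d,r
        contact_groups          admins
        }

# Ping the switch: warning at 200 ms or 20% loss, critical at 600 ms or 60% loss.
define service{
        use                     generic-service
        host_name               CistSW001
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        }
```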
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc. and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
   nsclient++ /install
5. Install the NSClient++ systray with the following command:
   nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we changed the network by adding a new firewall (Takeo). Because of the return route, we also need to add the IP address of the LAN interface of Takeo when Nagios and the NSClient++ machine are not in the same network. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by removing the comment character at the start of the line.
8. Start the NSClient++ service with the following command:
   nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
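Step 7 above can be sketched as follows. This is only an illustration: it assumes the default NSC.INI layout, where allowed_hosts lives in the [Settings] section and commented lines start with a semicolon, and both addresses are placeholders rather than the real CIST addresses.

```ini
[Settings]
; Before the change, the option is commented out (note the leading semicolon):
;allowed_hosts=127.0.0.1/32

; After the change: the Nagios server plus the LAN interface of the firewall.
; Both addresses are placeholders (assumptions), replace them with your own.
allowed_hosts=192.0.2.10,192.0.2.1
```

Restart the NSClient++ service after editing the file so that the new value is read.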
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notifications out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
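Applied to CIST, a Windows 2003 server such as kandal could be declared along the same lines. This is a sketch only: the address is a placeholder, and the parent switch and the alias are assumptions, since the real windows.cfg entries are not reproduced in this document.

```cfg
# Hypothetical windows.cfg entry for a CIST Windows server.
define host{
        use             windows-server                  ; Template defined above
        host_name       kandal
        alias           Students Files Server
        address         192.0.2.6                       ; Placeholder, replace with the real address
        parents         CistSW001                       ; Assumption: same parent switch as kohkong
        hostgroups      allhosts,windows-servers        ; Join both groups from the host side
        }
```

Listing the groups in the host's hostgroups directive has the same effect as adding the host to the members line of each hostgroup.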
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
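The -l argument of CPULOAD is a triplet of the form "minutes,warning,critical", and the check_nt plugin accepts several triplets in a single request. The variant below is a sketch, not part of the CIST configuration; the thresholds and the service name are arbitrary examples. It checks the 5-minute and the 15-minute averages together.

```cfg
# Illustrative variant: two triplets in one check.
#   5-minute average:  warning at 80%, critical at 90%
#   15-minute average: warning at 70%, critical at 80%
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load 5 and 15 min
        check_command           check_nt!CPULOAD!-l 5,80,90,15,70,80
        }
```

The service description avoids commas and parentheses, which the default illegal_object_name_chars setting rejects in object names.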
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
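SERVICESTATE also accepts a comma-separated list of service names after -l, so several Windows services can share one Nagios check. The sketch below is illustrative only: it assumes you also want to watch the print spooler, whose Windows service name is Spooler, and the service description is an arbitrary example.

```cfg
# Illustrative variant: one check for two Windows services.
# CRITICAL is raised if either W3SVC or Spooler is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

Grouping services keeps the status page short, at the cost of a less specific alert; separate definitions give one notification per service.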
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable, visual status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case, we have also replaced the default theme of Nagios with another one.
10.2 Adding and Changing Icons
Nagios icons exist in three formats: GIF, GD2 and PNG. The best thing to do is to use only PNG files, because you can then use the same icon in all the different modules of Nagios. Sample with kohkong:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
In this case we use suse.png.
10.2.1 Icon image
icon_image is used in the normal menus of Nagios.
10.2.2 Vrml_image
vrml_image is used in the 3D map environment, but because of our special Nagios theme we do not use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used in the 2D status map, which is the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GND format in 40x40 px. You can also convert any icon you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: the icons have to be copied, with your favorite SSH file browser, to /usr/local/nagios/share/images
11 Cist Monitored hosts
Here is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not exhaustive.
[Network diagram. The labels that can be recovered from it are listed below.]
Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP: Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1
- KOHKONG: SMTP, POP, Antispam, Mail Antivirus. HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5
- KANDAL: Students Files Server, Moodle, Antivirus ERO, Instant Messaging. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5
- TAKEO: Proxy, Firewall. HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1
- MODEM: ADSL Gateway. 512 Mbs, Fixed Public IP Address
- PURSAT: Supervision. PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1
- PAILIN: Learning Management, DataBase, Print server, Staff Files Server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1
- PREYVENG: Data backup (Kohkong, Kandal, Pailin), Ghost server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1
You can also access the documentation directly from the main web interface of Nagios.
5 Thanks
Special thanks to Ethan Galstad, the Nagios developer.
6 Nagios Files
6.1 Introduction
Nagios is driven by configuration files (.cfg files). These files have to be modified with your favorite editor (vi, gedit, ...) to activate a new configuration. You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on the configuration of hosts and other objects.
7.2 What Are Objects?
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined?
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined?
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation on object definitions.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
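The parent/child relationship mentioned in the last point can be sketched with two host definitions. The host names and addresses below are hypothetical (the addresses come from a range reserved for documentation), and the linux-server template is the one defined in localhost.cfg.

```cfg
# Parent: reached directly from the Nagios server.
define host{
        use             linux-server
        host_name       gateway-example         ; Hypothetical name
        alias           Example gateway
        address         192.0.2.1
        }

# Child: only reachable through gateway-example.
define host{
        use             linux-server
        host_name       server-example          ; Hypothetical name
        alias           Example server behind the gateway
        address         192.0.2.10
        parents         gateway-example
        }
```

With this relationship, when gateway-example is DOWN, Nagios reports server-example as UNREACHABLE instead of DOWN, which helps to locate the real point of failure.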
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.).
- Contacts receive notifications for the hosts and services they are responsible for.
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
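To show how a command and a service fit together, here is a short sketch based on the PING check used throughout this document. The command_line shown is how the check_ping command is expected to look in the sample commands.cfg; verify it against your own file. In the check_command of a service, the values separated by "!" are passed to the command as $ARG1$, $ARG2$, and so on.

```cfg
# Command definition, as expected in the sample commands.cfg.
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

# Service using the command:
#   $ARG1$ = 100.0,20%   (warning: round trip above 100 ms or 20% packet loss)
#   $ARG2$ = 500.0,60%   (critical: round trip above 500 ms or 60% packet loss)
define service{
        use                     local-service
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```

$USER1$ is defined in the resource file and normally points to the plugin directory, while $HOSTADDRESS$ is replaced by the address directive of the host being checked.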
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to your favorite directory, and that's it.
8.3 Nagios.cfg
Nagios.cfg is the master file that loads all the other files. By default, Nagios is "not open": the additional configuration files are not read. You have to enable them by uncommenting the lines marked with <--------- UNCOMMENT below.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg      <--------- UNCOMMENT

# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg     <--------- UNCOMMENT

# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg       <--------- UNCOMMENT

# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg       <--------- UNCOMMENT

# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg        <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg only describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; Service notifications can be sent anytime
        host_notification_period        24x7                    ; Host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; Send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; Send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; Send service notifications via email
        host_notification_commands      notify-host-by-email    ; Send host notifications via email
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server            ; The name of this host template
        use                     generic-host            ; This template inherits other values from the generic-host template
        check_period            24x7                    ; By default, Linux hosts are checked round the clock
        check_interval          5                       ; Actively check the host every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each Linux host 10 times (max)
        check_command           check-host-alive        ; Default command to check Linux hosts
        notification_period     workhours               ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval   120                     ; Resend notifications every 2 hours
        notification_options    d,u,r                   ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after it.

# Nagios host
define host{
        use             linux-server            ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

# kohkong
define host{
        use             linux-server            ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

# takeo ("," is used to list several IP addresses)
define host{
        use             linux-server            ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, like kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service 'freshness'
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# kohkong

# Define a service to "ping" the local machine
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }
# takeo

# Define a service to "ping" the local machine
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             HTTP_8080
        check_command                   check_http_8080
        notifications_enabled           0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use                     generic-host            ; Inherit default values from a template
        host_name               remotehost              ; The name we're giving to this host
        alias                   Some Remote Host        ; A longer name associated with the host
        address                 192.168.1.50            ; IP address of the host
        hostgroups              allhosts                ; Host groups this host is associated with
        }
So, for kohkong, the directives look like this:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name                    check_http
        command_name            check_http
        command_line            $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our example with kohkong for HTTP:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
We can do the same for all services. To create a new service, such as HTTP on port 8080, follow the same pattern; the HTTP_8080 service defined for takeo in localhost.cfg is an example.
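The takeo service refers to a command named check_http_8080, whose definition is not reproduced in this document. A minimal sketch of what such a command could look like follows. The command_line is an assumption built from the standard check_http definition plus the plugin's -p (port) option; it is not necessarily the definition used at CIST.

```cfg
# Assumed definition of check_http_8080 (not taken from the CIST commands.cfg):
# the standard check_http command line with the port forced to 8080.
define command{
        command_name            check_http_8080
        command_line            $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Service using the command above, mirroring the takeo HTTP_8080 service.
define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

A separate command per port keeps each service definition short. An alternative is to pass the port as an argument to the existing command (check_http!-p 8080).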
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that each one is designed for a specific kind of device: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
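As an illustration of how these files follow the same pattern, here is a hedged sketch of a printer entry in the style of the sample printer.cfg. The host name, alias and address are hypothetical. The generic-printer template and the check_hpjd command are assumed to be defined in the sample files, as they are in the stock Nagios samples; check your own files before relying on them.

```cfg
# Hypothetical printer host; name, alias and address are placeholders.
define host{
        use                     generic-printer         ; assumed template from the sample printer.cfg
        host_name               exampleprinter
        alias                   Example Network Printer
        address                 192.0.2.30              ; documentation-range address, replace with the real one
        hostgroups              allhosts
        }

# Printer status via SNMP, assuming the community string is "public".
define service{
        use                     generic-service
        host_name               exampleprinter
        service_description     Printer Status
        check_command           check_hpjd!-C public
        }
```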
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as:
- Memory usage
- CPU load
- Disk usage
- Service states
- Running processes
- etc.
Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command: nsclient++ /install
5. Install the NSClient++ systray with the following command: nsclient++ SysTray
   Beware of the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have made a change with a new firewall (Takeo). For the return route we therefore also need to add the IP address of Takeo's LAN interface, if that interface is not in the same network as Nagios and NSClient++. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command: nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notifications out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use                     windows-server          ; Inherit default values from a template
        host_name               winserver               ; The name we're giving to this host
        alias                   My Windows Server       ; A longer name associated with the host
        address                 192.168.1.2             ; IP address of the host
        hostgroups              allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name          windows-servers         ; The name of the hostgroup
        alias                   Windows Servers         ; Long name of the group
        members                 winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
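When a second Windows machine is added later, the same template and hostgroup can be reused. The sketch below assumes a hypothetical host named winserver2 with a placeholder address. The hostgroup definition shown would replace the single-member one above rather than sit next to it, since a hostgroup name can only be defined once.

```cfg
# Hypothetical second Windows host; name and address are placeholders.
define host{
        use                     windows-server          ; same template as winserver
        host_name               winserver2
        alias                   My Second Windows Server
        address                 192.0.2.20              ; documentation-range address, replace with the real one
        hostgroups              allhosts
        }

# Updated hostgroup: members is a comma separated list.
define hostgroup{
        hostgroup_name          windows-servers
        alias                   Windows Servers
        members                 winserver,winserver2
        }
```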
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name            check_nt
        command_line            $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
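To make the link between the service and the check_nt command definition explicit, the sketch below shows how the "!"-separated parts of check_command map onto $ARG1$ and $ARG2$. The 15-minute variant at the end is a hypothetical example with arbitrary thresholds; it is not a CIST setting.

```cfg
# check_command           check_nt!CPULOAD!-l 5,80,90
#   $ARG1$ = CPULOAD
#   $ARG2$ = -l 5,80,90   (interval in minutes, warning %, critical %)
# Command line after argument substitution:
#   $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v CPULOAD -l 5,80,90

# Hypothetical variant: average load over 15 minutes, with lower thresholds.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load 15 min
        check_command           check_nt!CPULOAD!-l 15,70,85
        }
```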
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
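Each drive needs its own service definition, with the drive letter passed through -l. The following sketch assumes the server also has a D:\ data drive, which is a hypothetical example; the thresholds are simply copied from the C:\ check.

```cfg
# Hypothetical second drive; assumes a D:\ volume exists on winserver.
define service{
        use                     generic-service
        host_name               winserver
        service_description     D:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l d -w 80 -c 90
        }
```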
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
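The -l option of SERVICESTATE accepts a comma separated list, so several Windows services can be watched by one Nagios service. The sketch below adds the print spooler (Windows service name Spooler) as an assumed second service of interest; the grouping is an illustration, not a CIST definition.

```cfg
# Hypothetical combined check: CRITICAL if either service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

Separate Nagios services per Windows service give clearer notifications; the combined form keeps the service list short.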
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable, visual status of the CIST network. To have a smooth map, we need "smooth icons". In our case we have also replaced the default Nagios theme with another one.
10.2 Add / Changing Icons
Nagios icons are supplied in several formats, such as GIF and GD2. The best thing to do, however, is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
In this case we use suse.png.
10.2.1 Icon image
This image is used in the normal menu of Nagios.
10.2.2 Vrml_image
This image is used in the 3D map environment, which we don't use because of our special Nagios theme. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
This image is used in the 2D status map, the one we do use. We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GND format in 40x40 pixels. You can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy the icons with your favorite SSH explorer to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.
[Network map. Recoverable labels are listed below.]
Networks: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Zones: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet access; Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus, ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mb/s, Fixed Public IP Address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
6 Nagios Files
6.1 Introduction
Nagios is driven mainly by configuration files (.cfg files). These files have to be modified with your favorite editor (vi, gedit, ...) to activate a new configuration. You need root access to modify them.
7 Basics of Nagios
7.1 Description
- Nagios works with several configuration files, which are based on host/object configuration.
7.2 What Are Objects?
Objects are all the elements that are involved in the monitoring and notification logic. Types of objects include:
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Notification Escalations
- Notification and Execution Dependencies
More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined?
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined?
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation on object definitions.
Once you get familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
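As a rough illustration of the template format and of inheritance, the sketch below defines a template and a host that inherits from it while overriding one value. All names and the address are hypothetical, and the fragment leaves out directives a complete host definition would need (contacts, periods); check-host-alive is the host check command used by the sample templates.

```cfg
# Hypothetical template: "register 0" means it is never monitored itself.
define host{
        name                    example-template        ; template name (hypothetical)
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   120
        register                0
        }

# Hypothetical host: inherits every directive above through "use".
define host{
        use                     example-template
        host_name               examplehost
        alias                   Example Host
        address                 192.0.2.10              ; documentation-range address
        max_check_attempts      5                       ; overrides the inherited value of 10
        }
```

A directive set locally in the host always wins over the inherited one, which is how the linux-server template can override notification_period from generic-host.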
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.).
- Hosts have an address of some kind (e.g. an IP or MAC address).
- Hosts have one or more services associated with them.
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic.
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.).
- Contacts receive notifications for hosts and services they are responsible for.
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
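To show how contacts and timeperiods fit together, here is a hedged sketch of a contact who is only notified during working hours. The contact name and email address are hypothetical. The generic-contact template and the workhours timeperiod are the ones defined in the localhost.cfg listing in this document.

```cfg
# Hypothetical daytime-only contact; name and email are placeholders.
define contact{
        contact_name                    dayadmin
        use                             generic-contact         ; template from localhost.cfg
        alias                           Daytime Administrator
        service_notification_period     workhours               ; overrides 24x7 from the template
        host_notification_period        workhours
        email                           dayadmin@example.org
        }
```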
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more...
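A command definition ties a short name to a plugin command line, and services pass arguments to it with "!" separators. The sketch below is modelled on the check_ping command from the stock sample commands.cfg; treat the exact command line as an assumption and compare it with your own file.

```cfg
# Modelled on the stock sample commands.cfg (verify against your copy).
define command{
        command_name            check_ping
        command_line            $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

# A service calling it with two arguments:
#   check_command           check_ping!100.0,20%!500.0,60%
#   $ARG1$ = 100.0,20%   warning at 100 ms round trip average or 20% packet loss
#   $ARG2$ = 500.0,60%   critical at 500 ms round trip average or 60% packet loss
```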
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to your favorite directory, and that's it.
8.3 Nagios.cfg
nagios.cfg is the master file that loads all the other files. By default Nagios is "not open": the other configuration files are not read until you enable them. Look for the lines marked "<--------- UNCOMMENT" and uncomment them to activate those files.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file.  I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes.  This should be the first option specified
# in the config file!!!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg       <--------- UNCOMMENT

# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg        <--------- UNCOMMENT

# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg        <--------- UNCOMMENT

# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg         <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts, typically UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc.  The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
#
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user
        email                           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }

# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                            linux-server    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Linux hosts are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Linux host 10 times (max)
        check_command                   check-host-alive ; Default command to check Linux hosts
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux or similar servers below.

# Nagios host
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               Nagios
        alias                   localhost
        address                 127.0.0.1
        icon_image              ultrapenguin.png
        vrml_image              ultrapenguin.png
        statusmap_image         ultrapenguin.png
        }

# kohkong
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# takeo
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               takeo
        alias                   localhost
        address                 1921681119216821192168311721601  ; “” to have multiple IP addresses
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, like kohkong
        }

# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       Nagios
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }

# END COPY/PASTE for SERVICES

# kohkong

# Define a service to "ping" the machine kohkong
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                             generic-service ; Name of service template to use
        host_name                       kohkong
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }

# takeo

# Define a service to "ping" the machine takeo
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP (port 8080) on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                             local-service   ; Name of service template to use
        host_name                       takeo
        service_description             HTTP_8080
        check_command                   check_http_8080
        notifications_enabled           0
        }

8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
In our configuration, the kohkong host looks like this:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own HTTP example, taken from the kohkong host:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
The same approach works for every service, including a new one such as HTTP on port 8080 (see the HTTP_8080 service defined for takeo in localhost.cfg).
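The HTTP_8080 service defined for takeo calls a command named check_http_8080, which is not part of the sample commands.cfg. The fragment below is a minimal sketch of how such a command could be declared. It assumes the standard check_http plugin and its -p (port) option, and it assumes the definition is added to commands.cfg; only the command name comes from our takeo service.

```cfg
# Assumed addition to commands.cfg: same as check_http, but on port 8080
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }
```

After adding it, the configuration can be verified with /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg before Nagios is restarted.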
8.6 Other cfg files (switch.cfg, printer.cfg, …)
All the other files work the same way as the localhost.cfg file. The main difference is that they are specially designed for "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> NSClient++ (nsclient.exe)
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
   nsclient++ /install
5. Install the NSClient++ systray with the following command:
   nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we have made a change with a new firewall (Takeo). For the return route, we therefore also need to add the IP address of Takeo's LAN interface if it is not in the same network as Nagios and NSClient++. The allowed_hosts option is commented out by default and has to be uncommented to take effect.
8. Start the NSClient++ service with the following command:
   nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notifications out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition inherits default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples covered here use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
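The SERVICESTATE mode of check_nt also accepts a comma-separated list of service names after -l, so several Windows services can be grouped under one Nagios service. The fragment below is only a sketch: the Spooler service and the service description are illustrative assumptions, not part of our configuration.

```cfg
# Sketch: watch two Windows services with a single check
# (Spooler and the description are hypothetical examples)
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

Grouping keeps the service list short, at the price of a single alert that does not say at first glance which service stopped; the plugin output lists the state of each one.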
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable view of the status of the CIST network. To get a smooth map, we need "smooth icons". In our case, we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons exist in three formats: GIF, GD2 and PNG. The best thing to do is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
In this case we use suse.png.
10.2.1 Icon_image
This image is used in the normal menus of Nagios.
10.2.2 Vrml_image
This image is used in the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML Client").
10.2.3 Statusmap_image
This image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 pixels, so you can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons
The icons have to be copied, with your favorite SSH client, to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the machines shown in red are monitored by Nagios, but the list is not limited to them…
[Figure: CIST network map. Labels recovered from the diagram:]
Networks: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Zones: Internet; Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs)
Hosts:
- KAMPOT & KEP: primary and secondary domain controllers, DNS, DHCP, NTP, WSUS. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1
- KOHKONG: SMTP, POP, antispam, mail antivirus. HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5
- KANDAL: students file server, Moodle, antivirus, ERO, instant messaging. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5
- TAKEO: proxy, firewall. HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1
- MODEM: ADSL gateway, 512 Mb/s, fixed public IP address
- PURSAT: supervision. PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1
- PAILIN: learning management database, print server, staff file server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1
- PREYVENG: data backup (Kohkong, Kandal, Pailin), Ghost server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1
Commands, Time Periods, Notification Escalations, Notification and Execution Dependencies. More information on what objects are and how they relate to each other can be found below.
7.3 Where Are Objects Defined
Objects are defined in one or more configuration files that you specify using the cfg_file and/or cfg_dir directives in the main configuration file.
7.4 How Are Objects Defined
Objects are defined in a flexible template format, which can make it much easier to manage your Nagios configuration in the long term. Basic information on how to define objects in your configuration files can be found in the Nagios documentation on object definitions.
Once you are familiar with the basics of how to define objects, you should read up on object inheritance, as it will make your configuration more robust for the future. Seasoned users can exploit some advanced features of object definitions, as described in the documentation on object tricks.
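As an illustration of object inheritance, the fragment below is a small sketch that builds on the linux-server template from our localhost.cfg. The host name, alias and address are hypothetical placeholders; the point is that any directive written in the host itself overrides the value inherited through "use".

```cfg
# Sketch: a host that inherits from linux-server and overrides two values
# (examplehost and its address are placeholders, not real CIST machines)
define host{
        use                     linux-server    ; template defined in localhost.cfg
        host_name               examplehost
        alias                   Example Linux Host
        address                 192.0.2.10
        max_check_attempts      5               ; overrides the 10 inherited from linux-server
        notification_period     24x7            ; overrides workhours from linux-server
        }
```

Everything not listed here (check_command, contact_groups, notification_interval and so on) still comes from linux-server and, through it, from generic-host.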
7.5 Objects Explained
Some of the main object types are explained in greater detail below.
Hosts are one of the central objects in the monitoring logic. Important attributes of hosts are as follows:
- Hosts are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.)
- Hosts have an address of some kind (e.g. an IP or MAC address)
- Hosts have one or more services associated with them
- Hosts can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic
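The parent/child relationship can be sketched as follows. Both hosts are hypothetical placeholders, and the switch is written out in full on the assumption that only the generic-host and linux-server templates from localhost.cfg are available.

```cfg
# Sketch: a switch (parent) and a server reached through it (child)
define host{
        use                     generic-host
        host_name               exampleswitch   ; hypothetical name
        alias                   Example Switch
        address                 192.0.2.1       ; placeholder address
        check_command           check-host-alive
        check_period            24x7
        max_check_attempts      10
        notification_interval   120
        notification_options    d,r
        contact_groups          admins
        }

define host{
        use                     linux-server
        host_name               exampleserver   ; hypothetical name
        alias                   Example Server
        address                 192.0.2.10      ; placeholder address
        parents                 exampleswitch   ; reached through the switch above
        }
```

With this in place, if exampleswitch goes down, Nagios reports exampleserver as UNREACHABLE rather than DOWN, which is the same mechanism our kohkong and takeo hosts rely on with CistSW001.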
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for the hosts and services they are responsible for
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more
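To make the role of commands concrete, here is a sketch of the check_ping command as it is usually shipped in the sample commands.cfg; the exact definition in your copy may differ slightly. $USER1$ is assumed to point to the plugin directory (set in resource.cfg), and $ARG1$ and $ARG2$ are filled from the values separated by "!" in a service's check_command.

```cfg
# Sketch of a command definition; macros are expanded by Nagios at run time
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }
```

With a service that sets check_command to check_ping!100.0,20%!500.0,60%, Nagios runs the plugin with -w 100.0,20% and -c 500.0,60% against the address of the host.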
8 Configuration Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resource.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and htpasswd.users. Just copy them to a directory of your choice, that's it.
8.3 Nagios.cfg
nagios.cfg is the main file; it references all the other configuration files. By default, Nagios does not read all of them, so you need to enable the additional configuration files yourself. Look for the lines marked "<--------- UNCOMMENT" and uncomment them to activate the files.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file!!!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg     <--------- UNCOMMENT

# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg    <--------- UNCOMMENT

# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg       <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg describes only the Nagios host itself, but you can copy/paste its definitions within the same file to add your new hosts, typically UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS

# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
    name                            generic-contact         ; The name of this contact template
    service_notification_period     24x7                    ; Service notifications can be sent anytime
    host_notification_period        24x7                    ; Host notifications can be sent anytime
    service_notification_options    w,u,c,r,f,s             ; Send notifications for all service states, flapping events, and scheduled downtime events
    host_notification_options       d,u,r,f,s               ; Send notifications for all host states, flapping events, and scheduled downtime events
    service_notification_commands   notify-service-by-email ; Send service notifications via email
    host_notification_commands      notify-host-by-email    ; Send host notifications via email
    register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
    }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
    contact_name    nagiosadmin             ; Short name of user
    use             generic-contact         ; Inherit default values from generic-contact template (defined above)
    alias           Nagios Admin            ; Full name of user
    email           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
    }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
    contactgroup_name   admins
    alias               Nagios Administrators
    members             nagiosadmin
    }
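If a second person should receive the same notifications, a contact can be added and listed in the group. The following is a minimal sketch; the contact name itbackup and its e-mail address are placeholders invented for illustration.

```cfg
# Hypothetical second administrator; name and address are placeholders.
define contact{
    contact_name    itbackup
    use             generic-contact         ; Inherit notification settings from the template
    alias           Backup IT Admin
    email           itbackup@example.org
    }

# The existing group then lists both contacts, separated by a comma.
define contactgroup{
    contactgroup_name   admins
    alias               Nagios Administrators
    members             nagiosadmin,itbackup
    }
```

The contactgroup shown here is meant to replace the existing admins definition, not to be added next to it.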
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
    name                            generic-host    ; The name of this host template
    notifications_enabled           1               ; Host notifications are enabled
    event_handler_enabled           1               ; Host event handler is enabled
    flap_detection_enabled          1               ; Flap detection is enabled
    failure_prediction_enabled      1               ; Failure prediction is enabled
    process_perf_data               1               ; Process performance data
    retain_status_information       1               ; Retain status information across program restarts
    retain_nonstatus_information    1               ; Retain non-status information across program restarts
    notification_period             24x7            ; Send host notifications at any time
    register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
    }

# Linux host definition template - This is NOT a real host, just a template!
define host{
    name                    linux-server        ; The name of this host template
    use                     generic-host        ; This template inherits other values from the generic-host template
    check_period            24x7                ; By default, Linux hosts are checked round the clock
    check_interval          5                   ; Actively check the host every 5 minutes
    retry_interval          1                   ; Schedule host check retries at 1 minute intervals
    max_check_attempts      10                  ; Check each Linux host 10 times (max)
    check_command           check-host-alive    ; Default command to check Linux hosts
    notification_period     workhours           ; Linux admins hate to be woken up, so we only notify during the day
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
    notification_interval   120                 ; Resend notifications every 2 hours
    notification_options    d,u,r               ; Only send notifications for specific host states
    contact_groups          admins              ; Notifications get sent to the admins by default
    register                0                   ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
    }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after it.

# Nagios host
define host{
    use             linux-server        ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
    host_name       Nagios
    alias           localhost
    address         127.0.0.1
    icon_image      ultrapenguin.png
    vrml_image      ultrapenguin.png
    statusmap_image ultrapenguin.png
    }

# kohkong
define host{
    use             linux-server        ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
    host_name       kohkong
    alias           localhost
    address         192.168.2.5
    parents         CistSW001
    icon_image      suse.png
    vrml_image      suse.png
    statusmap_image suse.png
    }

# takeo ("," is used to list several IP addresses)
define host{
    use             linux-server        ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
    host_name       takeo
    alias           localhost
    address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
    parents         CistSW001
    icon_image      suse.png
    vrml_image      suse.png
    statusmap_image suse.png
    }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
    hostgroup_name  allhosts
    alias           All Hosts
    members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, as was done for kohkong
    }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
    name                            generic-service ; The name of this service template
    active_checks_enabled           1               ; Active service checks are enabled
    passive_checks_enabled          1               ; Passive service checks are enabled/accepted
    parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service             1               ; We should obsess over this service (if necessary)
    check_freshness                 0               ; Default is to NOT check service "freshness"
    notifications_enabled           1               ; Service notifications are enabled
    event_handler_enabled           1               ; Service event handler is enabled
    flap_detection_enabled          1               ; Flap detection is enabled
    failure_prediction_enabled      1               ; Failure prediction is enabled
    process_perf_data               1               ; Process performance data
    retain_status_information       1               ; Retain status information across program restarts
    retain_nonstatus_information    1               ; Retain non-status information across program restarts
    is_volatile                     0               ; The service is not volatile
    check_period                    24x7            ; The service can be checked at any time of the day
    max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
    normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
    retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
    contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
    notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval           60              ; Re-notify about service problems every hour
    notification_period             24x7            ; Notifications can be sent out at any time
    register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
    }

# Local service definition template - This is NOT a real service, just a template!
define service{
    name                    local-service   ; The name of this service template
    use                     generic-service ; Inherit default values from the generic-service definition
    max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
    normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
    retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
    register                0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
    }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/
    }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     Current Users
    check_command           check_local_users!20!50
    }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
    }

# Define a service to check the load on the local machine.
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     Swap Usage
    check_command           check_local_swap!20!10
    }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
    }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                     local-service   ; Name of service template to use
    host_name               Nagios
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
    }

# END COPY/PASTE for SERVICES
# kohkong: define a service to "ping" the machine
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/
    }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     Current Users
    check_command           check_local_users!20!50
    }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
    }

# Define a service to check the load on the local machine.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     Swap Usage
    check_command           check_local_swap!20!10
    }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
    }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
    }
# takeo: define a service to "ping" the machine
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/
    }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     Current Users
    check_command           check_local_users!20!50
    }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
    }

# Define a service to check the load on the local machine.
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     Swap Usage
    check_command           check_local_swap!20!10
    }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
    }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                     local-service   ; Name of service template to use
    host_name               takeo
    service_description     HTTP_8080
    check_command           check_http_8080
    notifications_enabled   0
    }
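The listing above repeats the same service once per host. As a possible alternative, and only as a sketch, Nagios accepts a comma-separated list in host_name, so one definition can cover several hosts. It would be used instead of the per-host PING copies, not in addition to them, since a host may carry only one service with a given description.

```cfg
# One definition applied to several hosts: host_name takes a comma-separated list.
define service{
    use                     generic-service
    host_name               kohkong,takeo
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    }
```

One caveat worth checking against your own commands.cfg: in the stock sample file the check_local_* commands run their plugin on the Nagios server itself and do not reference the host address, so a service such as Root Partition reports on the Nagios machine whichever host_name it is attached to. Network checks such as PING, SSH and HTTP do query the named host.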
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
    use         generic-host        ; Inherit default values from a template
    host_name   remotehost          ; The name we're giving to this host
    alias       Some Remote Host    ; A longer name associated with the host
    address     192.168.1.50        ; IP address of the host
    hostgroups  allhosts            ; Host groups this host is associated with
    }

At CIST, the corresponding lines for kohkong look like this:

    host_name       kohkong
    alias           localhost
    address         192.168.2.5
    parents         CistSW001
    icon_image      suse.png
    vrml_image      suse.png
    statusmap_image suse.png

Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
    name            check_http
    command_name    check_http
    command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
    }

A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
    use                     generic-service ; Inherit default values from a template
    host_name               remotehost
    service_description     HTTP
    check_command           check_http
    }

This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?

Here is our own example, the HTTP check for kohkong:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                     generic-service ; Name of service template to use
    host_name               kohkong
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
    }

The same pattern works for every other service. A new service, such as HTTP on port 8080, is created the same way; the HTTP_8080 service defined for takeo in localhost.cfg is one example.
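The takeo service refers to a check_http_8080 command whose definition is not reproduced in this document. A minimal sketch of what such a command could look like, assuming it simply adds the plugin's -p (port) option to the stock check_http command line, is:

```cfg
# Assumed definition; the command actually used at CIST may differ.
define command{
    command_name    check_http_8080
    command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
    }

# The service from localhost.cfg that uses it.
define service{
    use                     local-service
    host_name               takeo
    service_description     HTTP_8080
    check_command           check_http_8080
    notifications_enabled   0
    }
```

Because the stock check_http command passes $ARG1$ through to the plugin, writing check_command check_http!-p 8080 in the service should give the same result without defining a new command.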
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as localhost.cfg. The main difference is that each one is designed for a particular kind of device: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> NSClient.exe
Linux machines -> Nagios-plugin-1.4.7
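As an illustration of how switch.cfg follows the same pattern, here is a hedged sketch of an entry for the switch that kohkong and takeo name as their parent. It assumes the generic-switch template shipped in the sample switch.cfg; the alias and the address are placeholders, not CIST values.

```cfg
# Sketch of a switch entry for switch.cfg (alias and address are placeholders).
define host{
    use         generic-switch      ; Template from the sample switch.cfg (assumed present)
    host_name   CistSW001           ; Must match the name used in "parents"
    alias       CIST switch 001
    address     192.0.2.10          ; Placeholder, replace with the real address
    }

define service{
    use                     generic-service
    host_name               CistSW001
    service_description     PING
    check_command           check_ping!200.0,20%!600.0,60%
    }
```

A host named in a parents directive has to be defined somewhere in the loaded configuration, otherwise the configuration check will report an error.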
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc. and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command: nsclient++ /install
5. Install the NSClient++ systray with the following command: nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), the network changed with the introduction of a new firewall (Takeo). If Takeo's LAN interface is not in the same network as Nagios and NSClient++, its IP address must also be added to allowed_hosts for the route back. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by removing the comment marker at the start of the line.
8. Start the NSClient++ service with the following command: nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file.
First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
    name                    windows-server      ; The name of this host template
    use                     generic-host        ; Inherit default values from the generic-host template
    check_period            24x7                ; By default, Windows servers are monitored round the clock
    check_interval          5                   ; Actively check the server every 5 minutes
    retry_interval          1                   ; Schedule host check retries at 1 minute intervals
    max_check_attempts      10                  ; Check each server 10 times (max)
    check_command           check-host-alive    ; Default command to check if servers are "alive"
    notification_period     24x7                ; Send notifications out at any time - day or night
    notification_interval   30                  ; Resend notifications every 30 minutes
    notification_options    d,r                 ; Only send notifications for specific host states
    contact_groups          admins              ; Notifications get sent to the admins by default
    register                0                   ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
    }

Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:

define host{
    use         windows-server      ; Inherit default values from a template
    host_name   winserver           ; The name we're giving to this host
    alias       My Windows Server   ; A longer name associated with the host
    address     192.168.1.2         ; IP address of the host
    hostgroups  allhosts            ; Host groups this server is associated with
    }

Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:

define hostgroup{
    hostgroup_name  windows-servers ; The name of the hostgroup
    alias           Windows Servers ; Long name of the group
    members         winserver       ; Comma separated list of hosts that belong to this group
    }

The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
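One of the "object definition tricks" a hostgroup makes possible is attaching a service to the group rather than to each host. A minimal sketch, assuming the windows-servers hostgroup defined above and the check_ping command from the sample commands.cfg:

```cfg
# Applied to every member of the hostgroup, present and future.
define service{
    use                     generic-service
    hostgroup_name          windows-servers
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    }
```

Any Windows server later added to the members list would pick up this check without a new service definition.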
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
    command_name    check_nt
    command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
    }
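The installation steps in section 9.2 do not set a password. If one were configured in NSC.INI (an assumption, not something this guide does), check_nt would need its -s option. A sketch with a hypothetical command name and a placeholder password:

```cfg
# Hypothetical variant for a password-protected NSClient++ agent.
define command{
    command_name    check_nt_secure
    command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s PASSWORD -v $ARG1$ $ARG2$
    }
```

Services would then reference check_nt_secure in place of check_nt, with the same arguments after the "!" separators.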
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:

define service{
    use                     generic-service
    host_name               winserver
    service_description     NSClient++ Version
    check_command           check_nt!CLIENTVERSION
    }

9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:

define service{
    use                     generic-service
    host_name               winserver
    service_description     Uptime
    check_command           check_nt!UPTIME
    }

9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater:

define service{
    use                     generic-service
    host_name               winserver
    service_description     CPU Load
    check_command           check_nt!CPULOAD!-l 5,80,90
    }

9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater:

define service{
    use                     generic-service
    host_name               winserver
    service_description     Memory Usage
    check_command           check_nt!MEMUSE!-w 80 -c 90
    }

9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater:

define service{
    use                     generic-service
    host_name               winserver
    service_description     C:\ Drive Space
    check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
    }

9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped:

define service{
    use                     generic-service
    host_name               winserver
    service_description     W3SVC
    check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
    }

9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running:

define service{
    use                     generic-service
    host_name               winserver
    service_description     Explorer
    check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
    }
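The SERVICESTATE check shown in section 9.10 can, as far as the check_nt help text describes it, take a comma-separated list after -l, so several Windows services can be grouped in one Nagios service. The following is a sketch only; Spooler (the Windows print spooler) is picked as an illustration and is not taken from the CIST configuration.

```cfg
# Hypothetical example: watch two Windows services with one Nagios service.
define service{
    use                     generic-service
    host_name               winserver
    service_description     Web and Print Services
    check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
    }
```

Grouping keeps the service list short, at the cost of one combined alert when any of the listed services stops.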
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Add / Changing Icons
The Nagios icons exist in three formats, including GIF and GD2. The best thing to do, however, is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:

    host_name       kohkong
    alias           localhost
    address         192.168.2.5
    parents         CistSW001
    icon_image      suse.png
    vrml_image      suse.png
    statusmap_image suse.png

In this case we use suse.png.
10.2.1 Icon image
This image is used in the normal Nagios menu.
10.2.2 Vrml_image
This image is used in the 3D map environment; because of our special Nagios theme, we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
This image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format in 40x40 px. You can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy the icons, with your favorite SSH explorer, to /usr/local/nagios/share/images
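Since kohkong and takeo carry the same three image lines, the icons could also be set once in a template and inherited. This is only a sketch of that idea; the template name suse-server is invented for illustration, and the kohkong definition shown would replace the existing one rather than be added to it.

```cfg
# Hypothetical template: SuSE hosts inherit their icons from it.
define host{
    name            suse-server
    use             linux-server
    icon_image      suse.png
    vrml_image      suse.png
    statusmap_image suse.png
    register        0               ; Template only, not a real host
    }

define host{
    use         suse-server
    host_name   kohkong
    alias       localhost
    address     192.168.2.5
    parents     CistSW001
    }
```

Changing the icon for all SuSE hosts then means editing a single template instead of every host definition.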
11 CIST Monitored hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited...
[Network map of the CIST hosts. The labels recovered from the figure are listed below.]
Networks: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Zones: Internet; Internet Access; Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs)
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, Fixed Public IP Address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
Services are one of the central objects in the monitoring logic. Services are associated with hosts and can be:
- Attributes of a host (CPU load, disk usage, uptime, etc.)
- Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
- Other things associated with the host (DNS records, etc.)
Contacts are people involved in the notification process:
- Contacts have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
- Contacts receive notifications for hosts and services they are responsible for
Timeperiods are used to control:
- When hosts and services can be monitored
- When contacts can receive notifications
Commands are used to tell Nagios what programs, scripts, etc. it should execute to perform:
- Host and service checks
- Notifications
- Event handlers
- and more...
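To make the link between commands and services concrete, here is how the PING check fits together. The command definition is quoted from the sample commands.cfg as best recalled, so compare it with your own file; the service is the one defined for the Nagios host in localhost.cfg.

```cfg
# check_ping as defined in the sample commands.cfg (verify against your copy)
define command{
    command_name    check_ping
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
    }

# In a service, "!" separates the command name from its arguments:
#   $ARG1$ = 100.0,20%   warning above 100 ms round trip or 20% packet loss
#   $ARG2$ = 500.0,60%   critical above 500 ms round trip or 60% packet loss
define service{
    use                     local-service
    host_name               Nagios
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    }
```

$USER1$ is the plugin directory macro set in the resource file, and $HOSTADDRESS$ is filled in from the address directive of the host the service belongs to.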
8 Configurations Files For CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resources.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and httpasswd.cfg. Just copy them to the directory of your choice, that's it.
8.3 Nagios.cfg
Nagios.cfg is the master file, which loads all the other files. By default Nagios is "not open": the other configuration files are not read, so you have to enable them. Look for the <--------- UNCOMMENT markers and uncomment those lines to activate them.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
# Read the documentation for more information on this configuration
# file. I've provided some comments here, but things may not be so
# clear without further explanation.
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes. This should be the first option specified
# in the config file!
log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
These are the object configuration files in which you define hosts
host groups contacts contact groups services etc
You can split your object definitions across several config files
if you wish (as shown below) or keep them all in a single config file
Command definitions
cfg_file=usrlocalnagiosetccommandscfg lt --------- UNCOMMENT
Host and service definitions etc for monitoring this machine
cfg_file=usrlocalnagiosetclocalhostcfg lt --------- UNCOMMENT
Sample definitions for monitoring a Windows machine
cfg_file=usrlocalnagiosetcwindowscfg lt --------- UNCOMMENT
Sample definitions for monitoring a network printer
cfg_file=usrlocalnagiosetcprintercfg lt --------- UNCOMMENT
Sample definitions for monitoring a switchrouter
cfg_file=usrlocalnagiosetcswitchcfg lt --------- UNCOMMENT
You can also tell Nagios to process all config files (with a cfg
extension) in a particular directory by using the cfg_dir
directive as shown below
cfg_dir=usrlocalnagiosetcservers
cfg_dir=usrlocalnagiosetcprinters
cfg_dir=usrlocalnagiosetcswitches
cfg_dir=usrlocalnagiosetcrouters
8.4 Localhost.cfg
By default, localhost.cfg describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts, typically UNIX/Linux machines. For Windows machines there is a separate file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an extremely simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           Normal Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
#
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }

# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server    ; The name of this host template
        use                     generic-host    ; This template inherits other values from the generic-host template
        check_period            24x7            ; By default, Linux hosts are checked round the clock
        check_interval          5               ; Actively check the host every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each Linux host 10 times (max)
        check_command           check-host-alive ; Default command to check Linux hosts
        notification_period     workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux or similar servers here, after this definition.

# Nagios host
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       Nagios
        alias           localhost
        address         127.0.0.1
        icon_image      ultrapenguin.png
        vrml_image      ultrapenguin.png
        statusmap_image ultrapenguin.png
        }

# kohkong
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }

# takeo
define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       takeo
        alias           localhost
        address         192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1  ; "," to have multiple IP addresses
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, as was done for kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# Local service definition template - This is NOT a real service, just a template!
define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        register                0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# Services for kohkong

# Define a service to "ping" the machine
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Services for takeo

# Define a service to "ping" the machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of the localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
In our configuration, the corresponding lines for kohkong are:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own example, the HTTP service for kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
The same approach works for all services. A new service such as HTTP on port 8080 is created in the same way, as the HTTP_8080 service defined for takeo in localhost.cfg shows.
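The HTTP_8080 service for takeo relies on a check_http_8080 command whose definition is not reproduced in this document. A minimal sketch of what such a command and its service could look like, assuming it is simply check_http pointed at TCP port 8080 through the plugin's -p option; the actual definition in the CIST commands.cfg may differ:

```cfg
# Assumed definition (not taken from the CIST files): check_http on TCP port 8080
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080
        }

# Service using the command above, mirroring the takeo entry in localhost.cfg
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

The command would go in commands.cfg and the service in localhost.cfg. Hard-coding the port in a dedicated command keeps the service definition free of arguments.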
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main difference is that they are designed specifically for printers, switches and Windows machines. For Windows servers and Linux servers, remember that you need an agent on the server to check its services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
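Because kohkong and takeo declare CistSW001 as their parent in localhost.cfg, a host with that name has to be defined somewhere, for example in switch.cfg. A minimal sketch, assuming the generic-host template and the admins contact group from localhost.cfg; the alias and the address are placeholders, not the real values for the switch:

```cfg
# Sketch: the parent switch referenced by "parents CistSW001" in localhost.cfg
define host{
        use                     generic-host     ; template defined in localhost.cfg
        host_name               CistSW001
        alias                   CIST switch 001  ; placeholder alias (assumption)
        address                 192.0.2.10       ; placeholder address (assumption)
        check_command           check-host-alive
        check_period            24x7
        max_check_attempts      10
        notification_interval   120
        notification_options    d,u,r
        contact_groups          admins
        }
```

With the parent defined, Nagios can tell a host that is DOWN from one that is merely UNREACHABLE behind a failed switch, and the status map can draw the link between them.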
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as:
- Memory usage
- CPU load
- Disk usage
- Service states
- Running processes
- etc.
Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc. and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
   nsclient++ /install
5. Install the NSClient++ systray with the following command:
   nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we made a change by adding a new firewall (Takeo). For the return route we therefore also need to add the IP address of the LAN interface of Takeo, if it is not in the same network as Nagios and the NSClient. By default the allowed_hosts option is commented out ("remark" mode) and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command:
   nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black 'M' inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notifications out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
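The check_command value is split on the ! character: the first part is the command name, and the following parts fill $ARG1$, $ARG2$ and so on in the command definition from section 9.4. As an illustration, here is how the CPU load service expands, followed by a hypothetical variant over a 15-minute window (the second service is an example invented here, not part of the CIST configuration):

```cfg
# check_command    check_nt!CPULOAD!-l 5,80,90
#                           |       |
#                           |       +-- $ARG2$ = -l 5,80,90  (5-minute average, warning at 80%, critical at 90%)
#                           +---------- $ARG1$ = CPULOAD
#
# Resulting command line for winserver ($HOSTADDRESS$ = 192.168.1.2):
#   $USER1$/check_nt -H 192.168.1.2 -p 12489 -v CPULOAD -l 5,80,90

# Hypothetical variant: the same check averaged over 15 minutes
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load 15 min
        check_command           check_nt!CPULOAD!-l 15,80,90
        }
```

Running the expanded command line by hand on the Nagios server is a quick way to confirm that NSClient++ answers before adding a service.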
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped:
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to get a smooth map
The Statusmap is the visual, human-readable status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
The Nagios icons exist in three formats, among them GIF and GD2. The best thing to do is to use only PNG files, so that you have the same icon for all the different modules of Nagios. Sample with kohkong:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
In this case we use suse.png.
10.2.1 Icon image
icon_image is for the normal menu of Nagios.
10.2.2 Vrml_image
vrml_image is for the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format at 40x40 pixels. You can also convert any icons you find on the internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons
The icons have to be copied, with your favorite SSH file-transfer tool, to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Here is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.
[Figure: CIST network map. Text recovered from the diagram:]
Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones: Internet; Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs)
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Servers:
- KAMPOT & KEP: Primary and Secondary Domain Controller, DNS, DHCP, NTP, WSUS. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1
- KOHKONG: SMTP, POP, Antispam, Mail Antivirus. HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5
- KANDAL: Students Files Server, Moodle, Antivirus, ERO, Instant Messaging. HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5
- TAKEO: Proxy, Firewall. HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1
- MODEM: ADSL Gateway. 512 Mbs, fixed public IP address
- PURSAT: Supervision. PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1
- PAILIN: Learning Management, DataBase, Print server, Staff Files Server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1
- PREYVENG: Data backup (Kohkong, Kandal, Pailin), Ghost server. PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1
8 Configuration Files for CIST
8.1 Location
All the configuration files are in /usr/local/nagios/etc.
8.2 Backup the Configuration Files
The only files you need are these: windows.cfg, switch.cfg, resources.cfg, printer.cfg, nagios.cfg, localhost.cfg, commands.cfg, cgi.cfg and httpasswd.cfg. Just copy them to a directory of your choice, and that's it.
8.3 Nagios.cfg
nagios.cfg is the master file that loads all the other files. By default Nagios is "not open": the references to the other configuration files are commented out, so you need to enable them. Look for the lines marked <--------- UNCOMMENT and uncomment them to activate the files.
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file.  I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes.  This should be the first option specified
# in the config file.

log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions
cfg_file=/usr/local/nagios/etc/commands.cfg     <--------- UNCOMMENT

# Host and service definitions, etc. for monitoring this machine
cfg_file=/usr/local/nagios/etc/localhost.cfg    <--------- UNCOMMENT

# Sample definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/windows.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a network printer
cfg_file=/usr/local/nagios/etc/printer.cfg      <--------- UNCOMMENT

# Sample definitions for monitoring a switch/router
cfg_file=/usr/local/nagios/etc/switch.cfg       <--------- UNCOMMENT

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:

#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg describes only the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts. This is typically done for UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an extremely simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc.  The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           Normal Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name    nagiosadmin             ; Short name of user
        use             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin            ; Full name of user
        email           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                    linux-server    ; The name of this host template
        use                     generic-host    ; This template inherits other values from the generic-host template
        check_period            24x7            ; By default, Linux hosts are checked round the clock
        check_interval          5               ; Actively check the host every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each Linux host 10 times (max)
        check_command           check-host-alive ; Default command to check Linux hosts
        notification_period     workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux or similar servers below.

# Nagios host
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               Nagios
        alias                   localhost
        address                 127.0.0.1
        icon_image              ultrapenguin.png
        vrml_image              ultrapenguin.png
        statusmap_image         ultrapenguin.png
        }

# kohkong
define host{
        use                     linux-server    ; Name of host template to use
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# takeo ("," is used to list several IP addresses)
define host{
        use                     linux-server    ; Name of host template to use
        host_name               takeo
        alias                   localhost
        address                 192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new hosts to the allhosts group here, like kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# End COPY/PASTE for SERVICES
# Services for kohkong

# Define a service to "ping" the machine
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Services for takeo

# Define a service to "ping" the machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
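The copy-and-paste approach used in the listing above can be sketched as follows for one additional Linux host. This is only an illustration: the host name newlinux and the address 192.0.2.10 are placeholders that do not appear in the CIST configuration, and the hostgroup shown replaces the existing allhosts definition rather than being added next to it.

```cfg
# Hypothetical new host: the name and address are placeholders
define host{
        use                     linux-server    ; Reuse the Linux host template
        host_name               newlinux
        alias                   New Linux Server
        address                 192.0.2.10
        parents                 CistSW001
        icon_image              suse.png
        statusmap_image         suse.png
        }

# Edit the existing allhosts group so that it lists the new host
define hostgroup{
        hostgroup_name          allhosts
        alias                   All Hosts
        members                 Nagios,kohkong,takeo,newlinux
        }

# One service copied from the block above, with only host_name changed
define service{
        use                     generic-service
        host_name               newlinux
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```

Only checks that query the host over the network (PING, SSH, HTTP) carry over unchanged; the check_local_* commands run on the Nagios machine itself.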
8.5 Explanations of the localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }

So, for kohkong, like this:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }

A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }

This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own example for HTTP, with kohkong:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

We can do this for all services. We can also create a new service, such as HTTP on port 8080.
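A minimal sketch of such a service, assuming a command named check_http_8080 (the name used for takeo in localhost.cfg) is added to commands.cfg. The CIST command definition itself is not reproduced in this document, so the command_line below is an assumption; it relies on the -p (port) option of the check_http plugin.

```cfg
# Assumed command definition for commands.cfg
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Service using the command, as defined for takeo in localhost.cfg
define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

A more general alternative would be to pass the port as an argument to the existing check_http command (for example check_http!-p 8080), which avoids one command definition per port.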
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "windows". But for Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc. and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
   nsclient++ /install
5. Install the NSClient++ systray with the following command:
   nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
In our case (CIST), we changed the network by adding a new firewall (Takeo). If Nagios and the NSClient++ machine are not in the same network, we therefore also need to add, for the return route, the IP address of Takeo's LAN interface. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting it.
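As an illustration only, the activated option might look like the fragment below. Both addresses are placeholders and not the real CIST values: the first stands for the Nagios server, the second for the LAN interface of Takeo. The [Settings] section name and the comma-separated list are assumptions based on the NSC.INI shipped with NSClient++ releases of that period, so check them against the comments in your own file.

```ini
[Settings]
; Hosts allowed to query NSClient++ (comma-separated list).
; 192.0.2.10 = Nagios server (placeholder)
; 192.0.2.1  = LAN interface of the firewall (placeholder)
allowed_hosts=192.0.2.10,192.0.2.1
```

The NSClient++ service has to be restarted before a change to NSC.INI takes effect.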
8. Start the NSClient++ service with the following command:
   nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black M inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notifications out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }

Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:

define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }

Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }

The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }

Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }

9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }

9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }

9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }

9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }

9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
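When several Windows servers are monitored, the windows-servers hostgroup defined in section 9.3 can carry a service for all of its members at once. The sketch below is one possible way to do this and is not taken from the CIST configuration; it uses the standard hostgroup_name directive of service definitions.

```cfg
# Applies the Uptime check to every member of the windows-servers hostgroup.
# If used, remove the per-host Uptime service for winserver to avoid a duplicate definition.
define service{
        use                     generic-service
        hostgroup_name          windows-servers
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
```

New Windows hosts then only need to be added to the members list of the hostgroup to receive the check.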
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
The Nagios icons come in several formats, such as GIF and GD2. The best thing to do is to use only PNG files, because then you have the same icon for all the different modules of Nagios. Sample with kohkong:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

In this case we use suse.png.
10.2.1 Icon image
The icon_image is used in the normal Nagios menus.
10.2.2 Vrml_image
The vrml_image is used in the 3D map environment, but because of our special Nagios theme we do not use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
The statusmap_image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format at 40x40 pixels. You can also convert any icons you find on the internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy the icons with your favorite SSH file transfer tool into /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them...
[Figure: CIST network map. Labels recovered from the diagram:]
Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones: Common Servers; Students PCs (~70 PCs) + VMware & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): primary and secondary domain controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, antispam, mail antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): students files server, Moodle, antivirus, ERO, instant messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): proxy, firewall
- MODEM (512 Mbs, fixed public IP address): ADSL gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): learning management database, print server, staff files server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): data backup (Kohkong, Kandal, Pailin), Ghost server
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0a4
#
# Read the documentation for more information on this configuration
# file.  I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 05-08-2007

# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes.  This should be the first option specified
# in the config file!

log_file=/usr/local/nagios/var/nagios.log

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# Command definitions  (<--- UNCOMMENT the line below)
cfg_file=/usr/local/nagios/etc/commands.cfg

# Host and service definitions, etc. for monitoring this machine  (<--- UNCOMMENT the line below)
cfg_file=/usr/local/nagios/etc/localhost.cfg

# Sample definitions for monitoring a Windows machine  (<--- UNCOMMENT the line below)
cfg_file=/usr/local/nagios/etc/windows.cfg

# Sample definitions for monitoring a network printer  (<--- UNCOMMENT the line below)
cfg_file=/usr/local/nagios/etc/printer.cfg

# Sample definitions for monitoring a switch/router  (<--- UNCOMMENT the line below)
cfg_file=/usr/local/nagios/etc/switch.cfg

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:

#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers
8.4 Localhost.cfg
By default, localhost.cfg only describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts, typically UNIX/Linux machines. For Windows machines there is another file, windows.cfg.
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc.  The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name workhours
        alias           "Normal" Work Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
# COMMANDS
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user
        email                           itsupportcistrainorg    ; << CHANGE THIS TO YOUR EMAIL ADDRESS
        }
# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                            linux-server    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Linux hosts are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Linux host 10 times (max)
        check_command                   check-host-alive ; Default command to check Linux hosts
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after it.

# Nagios host
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               Nagios
        alias                   localhost
        address                 127.0.0.1
        icon_image              ultrapenguin.png
        vrml_image              ultrapenguin.png
        statusmap_image         ultrapenguin.png
        }

# kohkong
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# takeo
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               takeo
        alias                   localhost
        address                 1921681119216821192168311721601  ; "" to have multiple IP addresses
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new host to the allhosts group here, like kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# kohkong

# Define a service to "ping" the machine
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# takeo

# Define a service to "ping" the machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
In our case, for kohkong:
        host_name       kohkong
        alias           localhost
        address         192.168.2.5
        parents         CistSW001
        icon_image      suse.png
        vrml_image      suse.png
        statusmap_image suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our example with kohkong for HTTP:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
We can do this for all services, including a new service such as HTTP on port 8080 (see the HTTP_8080 service defined for takeo in localhost.cfg).
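The HTTP_8080 service for takeo refers to a check_http_8080 command, which is not part of the sample commands.cfg. A minimal sketch of what such a command could look like is given here, assuming the standard check_http plugin and its -p (port) option; the exact definition used at CIST is not shown in this document, so treat the command below as an illustration only.

```
# Hypothetical command definition (an assumption, to be added to commands.cfg):
# same as check_http, but connects to TCP port 8080 instead of port 80
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080
        }

# Service using the command above (as defined for takeo in localhost.cfg)
define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

A dedicated command keeps the service definition short. Passing the port as an argument to a generic command would work as well.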
8.6 Other cfg files: switch.cfg, printer.cfg...
All the other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "Windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> NSClient++.exe
Linux machines -> nagios-plugins-1.4.7
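As an illustration of how switch.cfg follows the same pattern, here is a sketch of a host and a PING service for one of the CIST switches. It assumes that a generic-switch host template is available, as in the sample switch.cfg; the alias and the address are placeholders, not the real CIST values.

```
# Sketch only: generic-switch is assumed to be the host template from the
# sample switch.cfg; alias and address are placeholders
define host{
        use             generic-switch
        host_name       CistSW001
        alias           CIST Switch 001
        address         192.0.2.10      ; placeholder address
        }

# Ping the switch, using the check_ping command from commands.cfg
define service{
        use                     generic-service
        host_name               CistSW001
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        }
```

Once the switch is defined as a host, other hosts can reference it with the parents directive, as kohkong and takeo do.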
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes:
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
        nsclient++ /install
5. Install the NSClient++ systray with the following command:
        nsclient++ SysTray
   Beware of the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have made a change with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if Nagios and NSClient++ are not in the same network. The allowed_hosts option is commented out by default and has to be activated (uncommented).
8. Start the NSClient++ service with the following command:
        nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black 'M' inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notifications out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:
define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
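When a second Windows machine is added later, the same template and hostgroup can be reused. The following is only a sketch: the host name winserver2, its alias and its address are hypothetical. The hostgroup block replaces the one above rather than being added next to it, since a hostgroup name should be defined only once.

```
# Hypothetical second Windows host (name, alias and address are assumptions)
define host{
        use             windows-server
        host_name       winserver2
        alias           My Second Windows Server
        address         192.168.1.3
        hostgroups      allhosts
        }

# Updated hostgroup: members is a comma separated list, without spaces
define hostgroup{
        hostgroup_name  windows-servers
        alias           Windows Servers
        members         winserver,winserver2
        }
```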
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
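The same check can be repeated for any other drive by changing the drive letter passed with -l. As a sketch, assuming the Windows server also has a D:\ drive (an assumption, not something stated in this document):

```
# Sketch: same check for a hypothetical D:\ drive on winserver
define service{
        use                     generic-service
        host_name               winserver
        service_description     D:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l d -w 80 -c 90
        }
```

Each drive needs its own service definition with a distinct service_description, since Nagios identifies a service by its host name and description.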
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped:
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
But to have a smooth map we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons exist in three formats: GIF, GD2 and PNG. The best thing to do is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
In this case we use suse.png.
10.2.1 Icon image
is used in the normal menus of Nagios.
10.2.2 Vrml_image
is used in the 3D map environment, which we do not use because of our special Nagios theme. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format at 40x40 px. You can also convert any icon you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons
Copy the icons with your favorite SSH file browser to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not exhaustive...
Networks shown on the map: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Zones shown on the map: Internet; Internet Access; Common Servers; Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only)
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
- KAMPOT & KEP - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1: Primary and Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG - HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5: SMTP, POP, Antispam, Mail Antivirus
- KANDAL - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5: Students Files Server, Moodle, Antivirus, ERO, Instant Messaging
- TAKEO - HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1: Proxy, Firewall
- MODEM - 512 Mb/s, Fixed Public IP Address: ADSL Gateway
- PURSAT - PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1: Supervision
- PAILIN - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1: Learning Management DataBase, Print server, Staff Files Server
- PREYVENG - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1: Data backup (Kohkong, Kandal, Pailin), Ghost server
8.4 Localhost.cfg
By default, localhost.cfg describes the Nagios host itself, but you can copy and paste its definitions within the same file to add your new hosts, typically UNIX/Linux machines. For Windows machines there is another file, windows.cfg.

# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 02-27-2007
#
# NOTE: This config file is intended to serve as an extremely simple
#       example of how you can create your object configuration file(s).

# TIME PERIODS

# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
        timeperiod_name         24x7
        alias                   24 Hours A Day, 7 Days A Week
        sunday                  00:00-24:00
        monday                  00:00-24:00
        tuesday                 00:00-24:00
        wednesday               00:00-24:00
        thursday                00:00-24:00
        friday                  00:00-24:00
        saturday                00:00-24:00
        }

# 'workhours' timeperiod definition
define timeperiod{
        timeperiod_name         workhours
        alias                   Normal Work Hours
        monday                  09:00-17:00
        tuesday                 09:00-17:00
        wednesday               09:00-17:00
        thursday                09:00-17:00
        friday                  09:00-17:00
        }

# 'none' timeperiod definition
define timeperiod{
        timeperiod_name         none
        alias                   No Time Is A Good Time
        }

# COMMANDS

# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS

# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user
        email                           itsupport@cistrain.org  ; << CHANGE THIS TO YOUR EMAIL ADDRESS
        }

# CONTACT GROUPS

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }

# HOSTS

# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                            linux-server    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Linux hosts are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Linux host 10 times (max)
        check_command                   check-host-alive ; Default command to check Linux hosts
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }

# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux or similar servers after this one.
define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               Nagios          ; Nagios host
        alias                   localhost
        address                 127.0.0.1
        icon_image              ultrapenguin.png
        vrml_image              ultrapenguin.png
        statusmap_image         ultrapenguin.png
        }

define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               kohkong         ; kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

define host{
        use                     linux-server    ; Name of host template to use
                                                ; This host definition will inherit all variables that are defined
                                                ; in (or inherited by) the linux-server host template definition.
        host_name               takeo           ; takeo
        alias                   localhost
        address                 192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1 ; "," to have multiple IP addresses
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name          allhosts
        alias                   All Hosts
        members                 Nagios,kohkong,takeo    ; Add your new host to the allhosts group here, like kohkong
        }

# SERVICES

# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# End COPY/PASTE for SERVICES

# Services for kohkong

# Define a service to "ping" the machine kohkong
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# Services for takeo

# Define a service to "ping" the machine takeo
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
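
To illustrate the copy/paste approach described at the start of this section, here is a minimal sketch of what adding one more Linux host could look like. The host name newlinux, its alias and the address 192.0.2.10 are placeholders invented for this example (they are not CIST machines); the templates, the parent switch and the check_ping thresholds are the ones already used in the file above.

```cfg
# Hypothetical new host: name, alias and address are placeholders
define host{
        use                     linux-server    ; Inherit from the linux-server template
        host_name               newlinux
        alias                   New Linux Server
        address                 192.0.2.10
        parents                 CistSW001       ; Assumes the host hangs off the same switch
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# PING service for the hypothetical host, copied from the kohkong block
define service{
        use                     generic-service
        host_name               newlinux
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```

The new host name would also have to be appended to the members line of the allhosts hostgroup, as was done for kohkong and takeo.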
8.5 Explanations of the localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use                     generic-host            ; Inherit default values from a template
        host_name               remotehost              ; The name we're giving to this host
        alias                   Some Remote Host        ; A longer name associated with the host
        address                 192.168.1.50            ; IP address of the host
        hostgroups              allhosts                ; Host groups this host is associated with
        }
So, like this for kohkong:
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point - either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name                    check_http
        command_name            check_http
        command_line            $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?
Here is our example for HTTP with kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
We can do the same for all services. A new service, such as HTTP on port 8080, is created in the same way; the HTTP_8080 service defined for takeo in section 8.4 is an example.
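The command definition behind check_http_8080 is not reproduced in this document. As a minimal sketch, assuming it differs from the stock check_http command only by the plugin's -p (port) option, it could be defined as follows; the actual CIST definition may differ.

```cfg
# Assumed definition, not taken from the CIST commands.cfg
define command{
        command_name            check_http_8080
        command_line            $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }
```

A service then refers to it with check_command check_http_8080, in the same way a service refers to the stock check_http command.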
8.6 Other cfg files (switch.cfg, printer.cfg...)
All other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command: nsclient++ /install
5. Install the NSClient++ systray with the following command: nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have changed to a new firewall (Takeo), so for the return route we also need to add the IP address of Takeo's LAN interface when Nagios and the NSClient++ machine are not on the same network. The allowed_hosts option is commented out by default and has to be activated by removing the comment marker.
8. Start the NSClient++ service with the following command: nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notifications out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use                     windows-server          ; Inherit default values from a template
        host_name               winserver               ; The name we're giving to this host
        alias                   My Windows Server       ; A longer name associated with the host
        address                 192.168.1.2             ; IP address of the host
        hostgroups              allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.
define hostgroup{
        hostgroup_name          windows-servers         ; The name of the hostgroup
        alias                   Windows Servers         ; Long name of the group
        members                 winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
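As a sketch of how an additional server could join both groups later on, the definition below lists the two hostgroups on the host itself instead of editing the members line of each group. The host winserver2, its alias and the address 192.0.2.20 are hypothetical and only serve the example.

```cfg
# Hypothetical second Windows server; name, alias and address are placeholders
define host{
        use                     windows-server
        host_name               winserver2
        alias                   My Second Windows Server
        address                 192.0.2.20
        hostgroups              allhosts,windows-servers        ; Comma separated list of hostgroups
        }
```

Declaring membership on the host keeps everything about a new machine in one definition, at the cost of the hostgroup definition no longer showing the full member list.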
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name            check_nt
        command_line            $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
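To make the role of the two arguments concrete, the commented sketch below traces how one check_command value is split on the "!" separators and substituted into the command line. It assumes that $USER1$ is set to /usr/local/nagios/libexec (the usual value in resource.cfg for a source install) and uses the example address 192.168.1.2 of the winserver host.

```cfg
# check_command   check_nt!MEMUSE!-w 80 -c 90
#
#   command name : check_nt
#   $ARG1$       : MEMUSE
#   $ARG2$       : -w 80 -c 90
#
# Resulting command line (assuming $USER1$ = /usr/local/nagios/libexec):
#   /usr/local/nagios/libexec/check_nt -H 192.168.1.2 -p 12489 -v MEMUSE -w 80 -c 90
```

Checks that take no options, such as UPTIME, simply leave $ARG2$ empty.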
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
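The CPULOAD check of check_nt also accepts several interval,warning,critical triplets in a single -l argument. The sketch below is a hypothetical variant that checks both the 5-minute and the 15-minute averages; the second triplet (15,70,80) is an assumption chosen for illustration.

```cfg
# Hypothetical example: 5-minute load (warn 80, crit 90) and 15-minute load (warn 70, crit 80)
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load (5 and 15 min)
        check_command           check_nt!CPULOAD!-l 5,80,90,15,70,80
        }
```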
98 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90 or more or a WARNING alert if memory usage is 80 or greater define service
use generic-service
host_name winserver
service_description Memory Usage
check_command check_ntMEMUSE-w 80 -c 90
IT Department Page 35
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
99 Monitoring Disk Usage
The following service definition will monitor usage of the C drive on the Windows server and generate a CRITICAL alert if disk usage is 90 or more or a WARNING alert if disk usage is 80 or greater define service
use generic-service
host_name winserver
service_description C Drive Space
check_command check_ntUSEDDISKSPACE-l c -w 80 -c 90
910 Monitoring A Windows Service
The following service definition will monitoring the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped define service
use generic-service
host_name winserver
service_description W3SVC
check_command check_ntSERVICESTATE-d SHOWALL -l W3SVC
911 Monitoring A Windows Process
The following service definition will monitoring the Explorerexe process on the Windows machine and generate a CRITICAL alert if the process is not running define service
use generic-service
host_name winserver
service_description Explorer
check_command check_ntPROCSTATE-d SHOWALL -l Explorerexe
IT Department Page 36
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
10 Statusmap
101 How to have a smoothly map
The Statusmap is the Human Visuable status of the CIST Network
But to have this smoth map we need ldquosmooth iconsrdquo
IT Department Page 37
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
And in our case we have also change the default thems of nagios by another one
-gt
102 Add Changing Icons
The icons of nagios exists in three Formats GIFGD2 and GIF But the best thning to do is to use only PNG file cause you sould you havethe same icon for all differents modules of Nagios Sample with kohkong
host_name kohkong )
alias localhost )
address 19216825 )
parents Cist SW001 ) kohkong
icon_image susepng )
vrml_image susepng )
statusmap_image susepng )
We use in this case susepng
1021 Icon image
is for the normal menu of nagios
IT Department Page 38
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
1022 Vrml_image
is for the 3D Map environment but because of our special thems of nagios we donrsquot use In case of using the 3DMap the Windows Explorer or Firefox need a special plugin to run correctly You can find it at httpwwwparallelgraphicscomproductscortona ldquoCortona vrml clientrdquo
1023 Statusmap_image
is for the 2D Status Map the one we do use
IT Department Page 39
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
We do use special icons for it Those ones can be find at httpwwwnagiosexchangeorgImage_Packs750html the best fit is GND format in 40x40 pcx So you can also convert all your icons you find on internet to this special format Here it is a online tool to do this httpwwweasypictorg Where to put the icons The icons has to be put with your favorite SSH explorer in usrlocalnagiosshareimages
11 CIST monitored hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to these.

[Network diagram; the recoverable labels are listed below]

Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Segments: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet Access
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006

- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary and Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus, ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, Fixed Public IP Address): ADSL Gateway to the Internet
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
# COMMANDS
# NOTE: Sample command definitions can now be found in the sample commands.cfg file

# CONTACTS
# Generic contact definition template - This is NOT a real contact, just a template!
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }

# Just one contact defined by default - the Nagios admin (that's you)
define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user
        email                           itsupport@cistrain.org  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
# CONTACT GROUPS
# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
# HOSTS
# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Linux host definition template - This is NOT a real host, just a template!
define host{
        name                            linux-server    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Linux hosts are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Linux host 10 times (max)
        check_command                   check-host-alive ; Default command to check Linux hosts
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux or similar servers below.

# Nagios host
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               Nagios
        alias                   localhost
        address                 127.0.0.1
        icon_image              ultrapenguin.png
        vrml_image              ultrapenguin.png
        statusmap_image         ultrapenguin.png
        }

# kohkong
define host{
        use                     linux-server            ; Name of host template to use
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# takeo
define host{
        use                     linux-server            ; Name of host template to use
        host_name               takeo
        alias                   localhost
        address                 192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1 ; "," to list multiple IP addresses
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }
# HOST GROUPS
# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.
define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add your new host to the allhosts group here, like kohkong
        }
# SERVICES
# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Local service definition template - This is NOT a real service, just a template!
define service{
        name                            local-service   ; The name of this service template
        use                             generic-service ; Inherit default values from the generic-service definition
        max_check_attempts              4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE for SERVICES

# Define a service to ping the local machine
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# Services for kohkong

# Define a service to ping kohkong
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Root partition: warning if < 20% free, critical if < 10% free
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Logged-in users: warning if > 20 users, critical if > 50 users
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Running processes: warning if > 250 processes, critical if > 400 processes
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Load
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Swap usage: critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# SSH: notifications disabled by default, as not all users may have SSH enabled
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# HTTP: notifications disabled by default, as not all users may have HTTP enabled
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Services for takeo

# Define a service to ping takeo
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Root partition: warning if < 20% free, critical if < 10% free
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Logged-in users: warning if > 20 users, critical if > 50 users
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Running processes: warning if > 250 processes, critical if > 400 processes
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Load
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Swap usage: critical if less than 10% of swap is free, warning if less than 20% is free
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# SSH: notifications disabled by default, as not all users may have SSH enabled
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# HTTP on port 8080: notifications disabled by default, as not all users may have HTTP enabled
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of the localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define the host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }

In our case, for kohkong, it looks like this:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

Now that a definition has been added for the host that will be monitored, we can start defining the services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }

A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }

This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own HTTP example, taken from the kohkong services:

# Define a service to check HTTP.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

We can do the same for every service. A new service, such as HTTP on port 8080 (the HTTP_8080 service defined for takeo in localhost.cfg), is created in the same way.
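The HTTP_8080 service calls a check_http_8080 command, which is not one of the sample commands. The actual CIST definition is not shown in this document, so the following is only a sketch: it assumes the command simply runs the standard check_http plugin against port 8080, using the plugin's -p (port) option.

```cfg
# Sketch of a command for checking a web server on port 8080
# (assumed definition; to be added to commands.cfg)
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }
```

Any service that uses check_http_8080 as its check_command, like the one defined for takeo, will then query port 8080 instead of the default port 80.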
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work in the same way as the localhost.cfg file. The main principle is that each one is designed for one kind of device: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server in order to check its services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
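To illustrate the same mechanism for a switch, here is a sketch of a host and a PING service for the CistSW001 switch. The generic-switch template and the switches hostgroup are assumed to exist as in the sample switch.cfg, the alias is invented, and the address is only a placeholder, not the real address of the switch.

```cfg
# Sketch of a switch definition in switch.cfg (placeholder values)
define host{
        use             generic-switch          ; assumed template from the sample switch.cfg
        host_name       CistSW001
        alias           CIST switch 001         ; hypothetical alias
        address         192.0.2.10              ; placeholder, replace with the real address
        hostgroups      switches                ; assumed hostgroup from the sample switch.cfg
        }

# Ping the switch every 5 minutes
define service{
        use                     generic-service
        host_name               CistSW001
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        normal_check_interval   5
        retry_check_interval    1
        }
```

The host name matters here because the kohkong and takeo host definitions refer to CistSW001 in their parents directive.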
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc. and uncomment the reference to the windows.cfg file in the nagios.cfg file.
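Uncommenting the reference means removing the leading # from the corresponding cfg_file line in nagios.cfg. The line below is a sketch that assumes the default directory layout of the quickstart installation; adjust the path if the CIST configuration files are stored elsewhere.

```cfg
# In nagios.cfg: enable the Windows object definitions
# (path assumes the default quickstart layout)
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
```

Nagios has to be restarted after this change so that the new file is read.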
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command: nsclient++ /install
5. Install the NSClient++ systray with the following command: nsclient++ SysTray
Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
In our case (CIST) we have made changes with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and the NSClient. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting the line.
8. Start the NSClient++ service with the following command: nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notifications out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }

Notice that the Windows server template definition inherits default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:

define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }

Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:

define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }

The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
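One of the object definition tricks that a hostgroup makes possible is attaching a service to the whole group instead of to each host. The following is a sketch only; it assumes the windows-servers hostgroup defined above and the check_nt command from commands.cfg.

```cfg
# Sketch: one service definition applied to every member of windows-servers
define service{
        use                     generic-service
        hostgroup_name          windows-servers ; applies to all hosts in the group
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
```

Every Windows server added to the members list of windows-servers later on gets this check without any further service definition.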
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples covered here use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }

Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:

define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }

9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:

define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }

9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater:

define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }

9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater:

define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater:

define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }

9.10 Monitoring A Windows Service
The following service definition monitors the W3SVC service state on the Windows machine and generates a CRITICAL alert if the service is stopped:

define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
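The SERVICESTATE check also accepts a comma-separated list of services after -l, so several Windows services can be watched with one definition. The sketch below is an illustration only: the choice of the Spooler service and the service description are assumptions, not part of the CIST configuration.

```cfg
# Sketch: check two Windows services in a single check
# (Spooler is an illustrative choice, e.g. for a print server)
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

The trade-off is that a single alert then covers both services, so separate definitions remain preferable when each service needs its own notification.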
IT Department Page 19
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
CONTACT GROUPS
We only have one contact in this simple configuration file so there is
no need to create more than one contact group
define contactgroup
contactgroup_name admins
alias Nagios Administrators
members nagiosadmin
# HOSTS

# Generic host definition template - This is NOT a real host, just a template!

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
# Linux host definition template - This is NOT a real host, just a template!

define host{
        name                    linux-server            ; The name of this host template
        use                     generic-host            ; This template inherits other values from the generic-host template
        check_period            24x7                    ; By default, Linux hosts are checked round the clock
        check_interval          5                       ; Actively check the host every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each Linux host 10 times (max)
        check_command           check-host-alive        ; Default command to check Linux hosts
        notification_period     workhours               ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval   120                     ; Resend notifications every 2 hours
        notification_options    d,u,r                   ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }
# Since this is a simple configuration file, we only monitor one host - the
# local host (this machine).
# Add your new Linux (or similar) servers after this one.

# Nagios host
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               Nagios
        alias                   localhost
        address                 127.0.0.1
        icon_image              ultrapenguin.png
        vrml_image              ultrapenguin.png
        statusmap_image         ultrapenguin.png
        }

# kohkong
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }

# takeo (use "," to give the host several IP addresses)
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               takeo
        alias                   localhost
        address                 192.168.1.1,192.168.2.1,192.168.3.1,172.16.0.1
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }
# HOST GROUPS

# We only have one host in our simple config file, so there is no need to
# create more than one hostgroup.

define hostgroup{
        hostgroup_name  allhosts
        alias           All Hosts
        members         Nagios,kohkong,takeo    ; Add each new host to the allhosts group here, as was done for kohkong
        }
# SERVICES

# Generic service definition template - This is NOT a real service, just a template!

define service{
        name                            generic-service ; The name of this service template
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        check_freshness                 0               ; Default is to NOT check service freshness
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        is_volatile                     0               ; The service is not volatile
        check_period                    24x7            ; The service can be checked at any time of the day
        max_check_attempts              3               ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10              ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2               ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins          ; Notifications get sent out to everyone in the admins group
        notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60              ; Re-notify about service problems every hour
        notification_period             24x7            ; Notifications can be sent out at any time
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# Local service definition template - This is NOT a real service, just a template!

define service{
        name                    local-service           ; The name of this service template
        use                     generic-service         ; Inherit default values from the generic-service definition
        max_check_attempts      4                       ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1                       ; Re-check the service every minute until a hard state can be determined
        register                0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# START COPY/PASTE FOR SERVICES

# Define a service to "ping" the local machine
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE FOR SERVICES
# Services for kohkong

# Define a service to "ping" the local machine
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Services for takeo

# Define a service to "ping" the local machine
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP (port 8080) on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of localhost file and services

8.5.1 Creating A Host Definition

Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }

In our case, for kohkong, this gives:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
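As a minimal sketch of the steps above, a new Linux server could be added as follows. The host name newserver and the address 192.168.2.10 are placeholders, not machines from the CIST network; linux-server, allhosts and CistSW001 are assumed to be the objects already defined in our configuration files.

```cfg
# Hypothetical new Linux server (name and address are placeholders)
define host{
        use             linux-server            ; Inherit the Linux host template
        host_name       newserver
        alias           New Linux Server
        address         192.168.2.10
        parents         CistSW001               ; Switch the server is plugged into
        hostgroups      allhosts                ; Alternative to editing "members" in the hostgroup
        }
```

Using the hostgroups directive in the host definition avoids having to edit the members line of the allhosts group each time a host is added.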
8.5.2 Monitoring HTTP

Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }

A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }

This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?

Here is one of our own HTTP service definitions:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service         ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

The same can be done for every service. A new service, such as HTTP on port 8080 (HTTP_8080), is created in the same way; the takeo definition at the end of localhost.cfg is our example.
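The HTTP_8080 service defined for takeo relies on a check_http_8080 command whose definition is not reproduced in this document. The following is only a sketch of what such a command might look like, assuming it simply passes port 8080 to the check_http plugin through its -p option; the real definition in commands.cfg may differ.

```cfg
# Assumed definition of check_http_8080 (not taken from the CIST commands.cfg)
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }
```

A separate command keeps the service definition short. Another option would be to keep the standard check_http command and pass the port as an argument in the service, for example check_http!-p 8080.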
8.6 Other cfg files (switch.cfg, printer.cfg, …)

All the other files work the same way as the localhost.cfg file. The main principle is that each one is specially designed for one kind of equipment: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:

Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
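To illustrate that the mechanism is the same, here is a hedged sketch of what an entry in switch.cfg might look like for one of our switches. It assumes that a generic-switch host template is available (as in the Nagios sample files); the alias and the address 192.168.2.2 are placeholders, not the real values used at CIST.

```cfg
# Sketch of a switch entry (alias and address are placeholders)
define host{
        use             generic-switch          ; Assumed template from the Nagios sample files
        host_name       CistSW001
        alias           CIST Switch 001
        address         192.168.2.2
        }

# Ping the switch, using the same check_ping command as for the servers
define service{
        use                     generic-service
        host_name               CistSW001
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```

A switch needs no agent: it is checked from the Nagios server itself, here with a simple ping.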
9 Monitoring Windows Machines

9.1 Introduction

This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.

Notes:

These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent

Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
   nsclient++ /install
5. Install the NSClient++ systray with the following command:
   nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we have made a change by adding a new firewall (Takeo). When Nagios and the NSClient++ machine are not in the same network, the IP address of the LAN interface of Takeo must also be added, for the return route. The allowed_hosts option is commented out by default and has to be uncommented to take effect.
8. Start the NSClient++ service with the following command:
   nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration

You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notifications out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }

Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:

define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }

Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }

The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services

Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }

Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
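Before the examples, a short sketch of how a service's check_command value maps onto the check_nt command line may help. In a service definition, "!" separates the command name from the arguments, which Nagios substitutes for $ARG1$ and $ARG2$. The expansion below assumes that $USER1$ points to /usr/local/nagios/libexec (set in resource.cfg) and uses the winserver address from section 9.3.

```cfg
# In the service definition:
#       check_command   check_nt!CPULOAD!-l 5,80,90
#
#       $ARG1$ = CPULOAD
#       $ARG2$ = -l 5,80,90
#
# Command line that Nagios runs for winserver (192.168.1.2),
# assuming $USER1$ = /usr/local/nagios/libexec:
#       /usr/local/nagios/libexec/check_nt -H 192.168.1.2 -p 12489 -v CPULOAD -l 5,80,90
```

Running the expanded command by hand on the Nagios server is a quick way to test that the NSClient++ agent answers before adding the service to the configuration.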
9.5 Monitoring NSClient++ Version

The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }

9.6 Monitoring Uptime

The following service definition will allow you to monitor the uptime of the Windows server.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }

9.7 Monitoring CPU Load

The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }

9.8 Monitoring Memory Usage

The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage

The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }

9.10 Monitoring A Windows Service

The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }

9.11 Monitoring A Windows Process

The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
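Once several Windows servers are monitored, repeating each service for every host becomes tedious. As a sketch of the "object definition tricks" mentioned in section 9.3, a service can be attached to a whole hostgroup with the hostgroup_name directive. The example assumes the windows-servers hostgroup defined earlier; the choice of the Uptime check is only for illustration.

```cfg
# One definition, applied to every member of the windows-servers hostgroup
define service{
        use                     generic-service
        hostgroup_name          windows-servers         ; Instead of host_name winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
```

With this form, adding a new Windows server to the members of windows-servers is enough for it to receive the check.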
10 Statusmap

10.1 How to have a smooth map

The Statusmap is the visual, human-readable status of the CIST network. To get a smooth map, we need "smooth icons".

In our case, we have also replaced the default Nagios theme with another one.
10.2 Add / Changing Icons

Nagios icons are distributed in several formats, such as GIF and GD2. The best approach is to use only PNG files, so that the same icon is used by all the different modules of Nagios. Sample with kohkong:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 CistSW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

In this case we use suse.png.

10.2.1 Icon image

icon_image is used in the normal menu of Nagios.
10.2.2 Vrml_image

vrml_image is for the 3D map environment, but because of our special Nagios theme we don't use it. To use the 3D map, Internet Explorer or Firefox needs a special plugin to run correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").

10.2.3 Statusmap_image

statusmap_image is for the 2D Status Map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format at 40x40 px. You can also convert any icon you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org

Where to put the icons

The icons have to be copied, with your favorite SSH explorer, to /usr/local/nagios/share/images.
IT Department Page 20
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
the value that is inherited from the
generic-host template
notification_interval 120 Resend notifications every 2 hours
notification_options dur Only send notifications for
specific host states
contact_groups admins Notifications get sent to the
admins by default
register 0 DONT REGISTER THIS DEFINITION - ITS NOT A
REAL HOST JUST A TEMPLATE
Since this is a simple configuration file we only monitor one host - the
local host (this machine)
add here after your new linux
or similars servers
define host
use linux-server Name of host template to use
This host definition will inherit all
variables that are defined
in (or inherited by) the linux-server
host template definition
host_name Nagios )
alias localhost )
address 127001 )
icon_image ultrapenguinpng ) Naggios Host
vrml_image ultrapenguinpng )
statusmap_image ultrapenguinpng )
define host
use linux-server Name of host template to use
This host definition will inherit all
variables that are defined
in (or inherited by) the linux-server
host template definition
host_name kohkong )
alias localhost )
address 19216825 )
parents CistSW001 ) kohkong
icon_image susepng )
vrml_image susepng )
statusmap_image susepng )
define host
use linux-server Name of host template to use
This host definition will inherit all
variables that are defined
in (or inherited by) the linux-server
host template definition
host_name takeo )
alias localhost )
address 1921681119216821192168311721601
parents CistSW001 )
icon_image susepng ) takeo
vrml_image susepng ) ldquordquo to have multi ip addrees
statusmap_image susepng )
IT Department Page 21
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
HOST GROUPS
We only have one host in our simple config file so there is no need to
create more than one hostgroup
define hostgroup
hostgroup_name allhosts
alias All Hosts
members Nagioskohkongtakeo Add your New Host
the groups allhost
Here like kohkong
SERVICES
Generic service definition template - This is NOT a real service just a template
define service
name generic-service The name of this service
template
active_checks_enabled 1 Active service checks are
enabled
passive_checks_enabled 1 Passive service checks are
enabledaccepted
parallelize_check 1 Active service checks should
be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 We should obsess over this
service (if necessary)
check_freshness 0 Default is to NOT check
service freshness
notifications_enabled 1 Service notifications are
enabled
event_handler_enabled 1 Service event handler is
enabled
flap_detection_enabled 1 Flap detection is enabled
failure_prediction_enabled 1 Failure prediction is
enabled
process_perf_data 1 Process performance data
retain_status_information 1 Retain status information
across program restarts
retain_nonstatus_information 1 Retain non-status
information across program restarts
is_volatile 0 The service is not volatile
check_period 24x7 The service can be checked
at any time of the day
max_check_attempts 3 Re-check the service up to 3
times in order to determine its final (hard) state
normal_check_interval 10 Check the service every 10
minutes under normal conditions
retry_check_interval 2 Re-check the service every
two minutes until a hard state can be determined
IT Department Page 22
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
contact_groups admins Notifications get sent out
to everyone in the admins group
notification_options wucr Send notifications
about warning unknown critical and recovery events
notification_interval 60 Re-notify about service
problems every hour
notification_period 24x7 Notifications can be sent
out at any time
register 0 DONT REGISTER THIS
DEFINITION - ITS NOT A REAL SERVICE JUST A TEMPLATE
# Local service definition template - This is NOT a real service, just a template!

define service{
        name                            local-service           ; The name of this service template
        use                             generic-service         ; Inherit default values from the generic-service definition
        max_check_attempts              4                       ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        register                        0                       ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }
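As a sketch of what the two templates add up to, the block below spells out the values a service ends up with when it says "use local-service". The host name examplehost and the service description are hypothetical, and only some of the inherited directives are listed. In a real file the inherited lines are not written out.

```cfg
define service{
        host_name                       examplehost     ; hypothetical host, not part of the CIST configuration
        service_description             Example         ; hypothetical service name
        check_command                   check_ping!100.0,20%!500.0,60%
        max_check_attempts              4               ; from local-service (overrides 3 in generic-service)
        normal_check_interval           5               ; from local-service (overrides 10 in generic-service)
        retry_check_interval            1               ; from local-service (overrides 2 in generic-service)
        check_period                    24x7            ; from generic-service
        contact_groups                  admins          ; from generic-service
        notification_options            w,u,c,r         ; from generic-service
        notification_interval           60              ; from generic-service
        notification_period             24x7            ; from generic-service
        }
```

With these values a failing service is re-checked every minute and reaches a hard state on the fourth consecutive failed check, roughly three minutes after the first failure. The "register 0" line of a template is not inherited, so the service itself is registered normally.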
# START COPY/PASTE for SERVICES

# Define a service to "ping" the local machine

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }
# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }
# Define a service to check the load on the local machine.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }
# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       Nagios
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }

# END COPY/PASTE for SERVICES
# Define a service to "ping" the local machine (kohkong)

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }
# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }
# Define a service to check the load on the local machine.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }
# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }
# Define a service to "ping" the local machine (takeo)

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }
# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }
# Define a service to check the load on the local machine.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }
# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

#okdefine service{
#ok        use                             local-service           ; Name of service template to use
#ok        host_name                       takeo
#ok        service_description             HTTP_8080
#ok        check_command                   check_http_8080
#ok        notifications_enabled           0
#ok        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
So, for kohkong, like this:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own example for HTTP, on kohkong:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }

We can do the same for all services. A new service such as HTTP on port 8080 (the HTTP_8080 entry with check_http_8080 listed for takeo in localhost.cfg) is created in the same way.
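The command behind check_http_8080 is not shown in this document, so the following is only a sketch of how such a check could be defined. It assumes the standard check_http plugin and its -p (port) option. The command would go in commands.cfg and the service in the host's object file.

```cfg
# Assumed command definition: same as check_http, but on TCP port 8080
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Service using it, modelled on the takeo entries in localhost.cfg
define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

Keeping the port in the command definition leaves the service entry identical in shape to the plain HTTP one, which makes it easy to copy for other hosts.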
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that each one is designed for a specific kind of device: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to check its services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
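As an illustration of that shared mechanism, a minimal switch entry might look like the sketch below. The host name exampleswitch, the alias and the address 192.0.2.253 are placeholders rather than values from the CIST files, and the ping thresholds are assumed. A switch needs no agent for this check, because check_ping runs from the Nagios server.

```cfg
define host{
        use                     generic-host
        host_name               exampleswitch           ; placeholder name
        alias                   Example Switch          ; placeholder alias
        address                 192.0.2.253             ; placeholder address
        check_command           check-host-alive        ; stated explicitly in case the template does not set it
        max_check_attempts      10                      ; stated explicitly in case the template does not set it
        hostgroups              allhosts
        }

define service{
        use                     generic-service
        host_name               exampleswitch
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        }
```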
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
        nsclient++ /install
5. Install the NSClient++ systray with the following command:
        nsclient++ SysTray
   Beware of the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have made a change with a new firewall (Takeo), so for the route back we also need to add the IP address of the LAN interface of Takeo if that interface is not in the same network as Nagios and the NSClient. The allowed_hosts option is commented out by default ("remark" mode) and has to be activated by removing the comment mark at the start of the line.
8. Start the NSClient++ service with the following command:
        nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
        name                    windows-server          ; The name of this host template
        use                     generic-host            ; Inherit default values from the generic-host template
        check_period            24x7                    ; By default, Windows servers are monitored round the clock
        check_interval          5                       ; Actively check the server every 5 minutes
        retry_interval          1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                      ; Check each server 10 times (max)
        check_command           check-host-alive        ; Default command to check if servers are "alive"
        notification_period     24x7                    ; Send notification out at any time - day or night
        notification_interval   30                      ; Resend notifications every 30 minutes
        notification_options    d,r                     ; Only send notifications for specific host states
        contact_groups          admins                  ; Notifications get sent to the admins by default
        register                0                       ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:

define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }

The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
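One such object definition trick, sketched here as an assumption about how the group might be used rather than something taken from the CIST files: a service can name a hostgroup instead of a single host, so every member of windows-servers gets the check. The sketch relies on the check_nt command from the sample commands.cfg.

```cfg
# Applies the Uptime check to every host listed in the windows-servers hostgroup
define service{
        use                     generic-service
        hostgroup_name          windows-servers
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
```

If this form is used, the same service_description should not also be defined per host for a member of the group, otherwise the service is declared twice for that host.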
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
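To make the argument handling concrete, the commented sketch below traces how one check_command value is split on "!" and substituted into the check_nt command line. The plugin path /usr/local/nagios/libexec for $USER1$ and the address 192.168.1.2 are assumptions (a default source install and the winserver example). The check_nt_pw command is a hypothetical variant for agents that have a password set in NSC.INI, with PASSWORD as a placeholder.

```cfg
# check_command           check_nt!CPULOAD!-l 5,80,90
#   $ARG1$ = CPULOAD
#   $ARG2$ = -l 5,80,90
# Resulting command line (assuming $USER1$ = /usr/local/nagios/libexec):
#   /usr/local/nagios/libexec/check_nt -H 192.168.1.2 -p 12489 -v CPULOAD -l 5,80,90

# Hypothetical variant that sends the agent password with -s
define command{
        command_name    check_nt_pw
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s PASSWORD -v $ARG1$ $ARG2$
        }
```

Running the expanded line by hand on the Nagios server is a quick way to confirm that the agent answers before adding service definitions.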
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }

9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }

9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
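The -l argument of CPULOAD takes triplets of the form minutes,warning,critical, and several triplets can be chained in one check. The sketch below adds a 15-minute window to the 5-minute one. The second set of thresholds (70 and 80) is an assumed example, not a CIST setting.

```cfg
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load 5 and 15 min
        check_command           check_nt!CPULOAD!-l 5,80,90,15,70,80
        }
```

The service description avoids commas and parentheses because Nagios treats them as illegal characters in object names by default.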
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }

9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
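SERVICESTATE also accepts a comma-separated list after -l, so several Windows services can share one Nagios check. In the sketch below, Spooler (the Windows print spooler) is an assumed second service chosen only for illustration.

```cfg
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

One combined check keeps the service list short, at the cost of a single alert that does not say by its name which service stopped. The plugin output lists the state of each service when SHOWALL is set.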
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to get a smooth map
The Statusmap is the human-readable, visual status of the CIST network.
But to get a smooth map we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
The Nagios icons exist in three formats, GIF and GD2 among them. The best thing to do, however, is to use only PNG files, because you can then have the same icon for all the different modules of Nagios. Sample with kohkong:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

In this case we use suse.png.
10.2.1 Icon image
is for the normal menu of Nagios.
10.2.2 Vrml_image
is for the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, the browser (Internet Explorer or Firefox) needs a special plugin to run it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
is for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 px, so you can also convert any icon you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: the icons have to be copied, with your favorite SSH explorer, to /usr/local/nagios/share/images.
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited...

[Figure: CIST network map]

Servers shown on the map:
- KAMPOT & KEP (HP Proliant): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS. Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1.
- KOHKONG (HP Proliant): SMTP, POP, Antispam, Mail Antivirus. Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5.
- KANDAL (HP Proliant): Students Files Server, Moodle, Antivirus, ERO, Instant Messaging. Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5.
- TAKEO (HP Desktop): Proxy, Firewall. SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1.
- MODEM: ADSL Gateway, 512 Mb/s, fixed public IP address.
- PURSAT (PTC Desktop): Supervision. Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1.
- PAILIN (PTC Desktop): Learning Management, DataBase, Print server, Staff Files Server. Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1.
- PREYVENG (PTC Desktop): Data backup (Kohkong, Kandal, Pailin), Ghost server. Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1.

Zones shown on the map:
- Common Servers
- Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only)
- Staff Servers, Staff PCs, Printers & WiFi (~40 PCs)
- Internet Access
- Internet

Networks shown on the map: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30

Switches shown on the map: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
IT Department Page 21
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
HOST GROUPS
We only have one host in our simple config file so there is no need to
create more than one hostgroup
define hostgroup
hostgroup_name allhosts
alias All Hosts
members Nagioskohkongtakeo Add your New Host
the groups allhost
Here like kohkong
SERVICES
Generic service definition template - This is NOT a real service just a template
define service
name generic-service The name of this service
template
active_checks_enabled 1 Active service checks are
enabled
passive_checks_enabled 1 Passive service checks are
enabledaccepted
parallelize_check 1 Active service checks should
be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 We should obsess over this
service (if necessary)
check_freshness 0 Default is to NOT check
service freshness
notifications_enabled 1 Service notifications are
enabled
event_handler_enabled 1 Service event handler is
enabled
flap_detection_enabled 1 Flap detection is enabled
failure_prediction_enabled 1 Failure prediction is
enabled
process_perf_data 1 Process performance data
retain_status_information 1 Retain status information
across program restarts
retain_nonstatus_information 1 Retain non-status
information across program restarts
is_volatile 0 The service is not volatile
check_period 24x7 The service can be checked
at any time of the day
max_check_attempts 3 Re-check the service up to 3
times in order to determine its final (hard) state
normal_check_interval 10 Check the service every 10
minutes under normal conditions
retry_check_interval 2 Re-check the service every
two minutes until a hard state can be determined
IT Department Page 22
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
contact_groups admins Notifications get sent out
to everyone in the admins group
notification_options wucr Send notifications
about warning unknown critical and recovery events
notification_interval 60 Re-notify about service
problems every hour
notification_period 24x7 Notifications can be sent
out at any time
register 0 DONT REGISTER THIS
DEFINITION - ITS NOT A REAL SERVICE JUST A TEMPLATE
Local service definition template - This is NOT a real service just a template
define service
name local-service The name of this service
template
use generic-service Inherit default values from
the generic-service definition
max_check_attempts 4 Re-check the service up to 4
times in order to determine its final (hard) state
normal_check_interval 5 Check the service every 5
minutes under normal conditions
retry_check_interval 1 Re-check the service every
minute until a hard state can be determined
register 0 DONT REGISTER THIS
DEFINITION - ITS NOT A REAL SERVICE JUST A TEMPLATE
START COPYPASTE for SERVICES
Define a service to ping the local machine
define service
use local-service Name of service template
to use
host_name Nagios
service_description PING
check_command check_ping100020500060
Define a service to check the disk space of the root partition
on the local machine Warning if lt 20 free critical if
lt 10 free space on partition
define service
use local-service Name of service template
to use
host_name Nagios
service_description Root Partition
check_command check_local_disk2010
Define a service to check the number of currently logged in
users on the local machine Warning if gt 20 users critical
if gt 50 users
define service
use local-service Name of service template
to use
host_name Nagios
service_description Current Users
IT Department Page 23
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
check_command check_local_users2050
Define a service to check the number of currently running procs
on the local machine Warning if gt 250 processes critical if
gt 400 users
define service
use local-service Name of service template
to use
host_name Nagios
service_description Total Processes
check_command check_local_procs250400RSZDT
Define a service to check the load on the local machine
define service
use local-service Name of service template
to use
host_name Nagios
service_description Current Load
check_command check_local_load5040301006040
Define a service to check the swap usage the local machine
Critical if less than 10 of swap is free warning if less than 20 is free
define service
use local-service Name of service template
to use
host_name Nagios
service_description Swap Usage
check_command check_local_swap2010
Define a service to check SSH on the local machine
Disable notifications for this service by default as not all users may have SSH
enabled
define service
use local-service Name of service template
to use
host_name Nagios
service_description SSH
check_command check_ssh
notifications_enabled 0
Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
define service
use local-service Name of service template
to use
IT Department Page 24
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
host_name Nagios
service_description HTTP
check_command check_http
notifications_enabled 0
End COPYPASTE for SERVICES
Define a service to ping the local machine kohkong kohkong
define service
use generic-service Name of service template
to use
host_name kohkong
service_description PING
check_command check_ping100020500060
Define a service to check the disk space of the root partition
on the local machine Warning if lt 20 free critical if
lt 10 free space on partition
define service
use generic-service Name of service
template to use
host_name kohkong
service_description Root Partition
check_command check_local_disk2010
Define a service to check the number of currently logged in
users on the local machine Warning if gt 20 users critical
if gt 50 users
define service
use generic-service Name of service
template to use
host_name kohkong
service_description Current Users
check_command check_local_users2050
Define a service to check the number of currently running procs
on the local machine Warning if gt 250 processes critical if
gt 400 users
define service
use generic-service Name of service template
to use
host_name kohkong
service_description Total Processes
check_command check_local_procs250400RSZDT
Define a service to check the load on the local machine
define service
use generic-service Name of service
template to use
IT Department Page 25
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
host_name kohkong
service_description Current Load
check_command check_local_load5040301006040
Define a service to check the swap usage the local machine
Critical if less than 10 of swap is free warning if less than 20 is free
define service
use generic-service Name of service
template to use
host_name kohkong
service_description Swap Usage
check_command check_local_swap2010
Define a service to check SSH on the local machine
Disable notifications for this service by default as not all users may have SSH
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description SSH
check_command check_ssh
notifications_enabled 0
Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description HTTP
check_command check_http
notifications_enabled 0
Define a service to ping the local machine takeo takeo
define service
use local-service Name of service template
to use
host_name takeo
service_description PING
check_command check_ping100020500060
Define a service to check the disk space of the root partition
on the local machine Warning if lt 20 free critical if
lt 10 free space on partition
define service
IT Department Page 26
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
use local-service Name of service template
to use
host_name takeo
service_description Root Partition
check_command check_local_disk2010
Define a service to check the number of currently logged in
users on the local machine Warning if gt 20 users critical
if gt 50 users
define service
use local-service Name of service template
to use
host_name takeo
service_description Current Users
check_command check_local_users2050
Define a service to check the number of currently running procs
on the local machine Warning if gt 250 processes critical if
gt 400 users
define service
use local-service Name of service template
to use
host_name takeo
service_description Total Processes
check_command check_local_procs250400RSZDT
Define a service to check the load on the local machine
define service
use local-service Name of service template
to use
host_name takeo
service_description Current Load
check_command check_local_load5040301006040
Define a service to check the swap usage the local machine
Critical if less than 10 of swap is free warning if less than 20 is free
define service
use local-service Name of service template
to use
host_name takeo
service_description Swap Usage
check_command check_local_swap2010
# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH
# enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP
# enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
8.5 Explanations of the localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }

For our host kohkong, the directives look like this:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point - either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }

A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
        use                     generic-service ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own HTTP example, for kohkong:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP
# enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
We can do the same for every service, including new ones such as HTTP on port 8080 (see the HTTP_8080 service defined for takeo in localhost.cfg).
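The HTTP_8080 service for takeo refers to a check_http_8080 command, which is not reproduced in this document. As a minimal sketch, assuming the stock check_http plugin and its -p (port) option, such a command and the service using it could look like the following. The command definition is an assumption for illustration, not a copy of the CIST commands.cfg.

```
# Assumed command definition (hypothetical, not taken from the CIST commands.cfg):
# same as check_http, but forced to TCP port 8080 with -p.
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Service using the command above (matches the takeo definition in localhost.cfg).
define service{
        use                     local-service   ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

Any new service follows the same pattern: first a command definition that wraps the plugin call, then a service definition that names the host and the command.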
8.6 Other cfg files (switch.cfg, printer.cfg, …)
All the other files work the same way as the localhost.cfg file. The main principle is that each one is designed for a specific kind of device: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
        Windows machines -> Nsclient.exe
        Linux machines   -> Nagios-plugin-1.4.7
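As an illustration of what such a file contains, here is a minimal sketch of a switch entry. It assumes a generic-switch host template like the one used by the sample switch.cfg; the host name CISTSW001 comes from the network map, while the alias and the address (a documentation placeholder) are hypothetical and must be replaced with the real values.

```
# Hypothetical switch definition; alias and address are placeholders.
define host{
        use             generic-switch          ; Assumed template, as in the sample switch.cfg
        host_name       CISTSW001
        alias           CIST Switch 001
        address         192.0.2.10              ; Placeholder, replace with the switch IP
        }

# Basic reachability check for the switch.
define service{
        use                     generic-service
        host_name               CISTSW001
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        }
```

Switches and printers answer this kind of check directly, so no agent is needed on them.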
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
        nsclient++ /install
5. Install the NSClient++ systray with the following command:
        nsclient++ SysTray
   Beware of the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have made a change with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and the NSClient++ machine. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command:
        nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black M inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are alive
        notification_period     24x7            ; Send notification out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:

define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }

The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
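One of the object definition tricks mentioned above is to attach a service to a hostgroup instead of a single host. The sketch below assumes the windows-servers hostgroup defined above and the check_nt command described in the next section; it is an illustration, not part of the CIST configuration.

```
# Illustrative only: one definition creates the service on every member
# of the windows-servers hostgroup.
define service{
        use                     generic-service
        hostgroup_name          windows-servers ; Applies to all hosts in this group
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
```

With this approach, adding a new Windows server to the hostgroup members list is enough for it to pick up the service, without copying the service definition for each host.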
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }

Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
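The SERVICESTATE check accepts a comma-separated list of service names after -l, so several Windows services can be grouped in one Nagios service. The sketch below is an illustration only: Spooler (the Windows print spooler) is an assumed second service, chosen because one of the CIST machines acts as a print server.

```
# Illustrative only: check two Windows services in a single Nagios service.
# "Spooler" is an assumed example; use the real Windows service names.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

Separate Nagios services per Windows service give finer-grained notifications, while a grouped check keeps the service list shorter.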
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to get a smooth map
The Statusmap is the visual, human-readable status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case, we have also replaced the default Nagios theme with another one.
10.2 Adding / Changing Icons
The Nagios icons exist in three formats, among them GIF and GD2. The best thing to do, however, is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

In this case we use suse.png.
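The same three directives can be added to any host definition. As a minimal sketch, the Windows host from section 9.3 could be given an icon as follows; the file name win40.png is hypothetical and stands for whatever PNG file you have copied to the server. In a standard installation, Nagios looks these file names up in the logos subdirectory of its images directory.

```
# Illustrative only: win40.png is a hypothetical file name.
define host{
        use                     windows-server
        host_name               winserver
        alias                   My Windows Server
        address                 192.168.1.2
        icon_image              win40.png       ; Shown in the normal status pages
        icon_image_alt          Windows Server  ; Alternative text for the icon
        statusmap_image         win40.png       ; Shown in the 2D status map
        }
```

vrml_image is left out here because the 3D map is not used at CIST.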
10.2.1 Icon image
icon_image is used in the normal menus of Nagios.
10.2.2 Vrml_image
vrml_image is used in the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, the browser (Internet Explorer or Firefox) needs a special plugin to run it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format in 40x40 pixels, so you can also convert any icon you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy them with your favorite SSH file-transfer client to /usr/local/nagios/share/images
11 CIST Monitored Hosts
The network map shows all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.

[Figure: CIST network map. The information recoverable from the diagram is listed below.]

Servers and gateways:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, Fixed Public IP Address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server

Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006

Network zones: Internet; Internet Access; Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs)

Address ranges shown on the map: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
        contact_groups          admins          ; Notifications get sent out to everyone in the admins group
        notification_options    w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval   60              ; Re-notify about service problems every hour
        notification_period     24x7            ; Notifications can be sent out at any time
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE

# Local service definition template - This is NOT a real service, just a template
define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE
        }
# START COPY/PASTE for SERVICES

# Define a service to ping the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }
# Define a service to check the load on the local machine.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH
# enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP
# enabled.
define service{
        use                     local-service   ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# END COPY/PASTE for SERVICES
# Define a service to ping the local machine (kohkong)
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }
# Define a service to check the load on the local machine.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH
# enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP
# enabled.
define service{
        use                     generic-service ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
Define a service to ping the local machine takeo takeo
define service
use local-service Name of service template
to use
host_name takeo
service_description PING
check_command check_ping100020500060
Define a service to check the disk space of the root partition
on the local machine Warning if lt 20 free critical if
lt 10 free space on partition
define service
IT Department Page 26
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
use local-service Name of service template
to use
host_name takeo
service_description Root Partition
check_command check_local_disk2010
Define a service to check the number of currently logged in
users on the local machine Warning if gt 20 users critical
if gt 50 users
define service
use local-service Name of service template
to use
host_name takeo
service_description Current Users
check_command check_local_users2050
Define a service to check the number of currently running procs
on the local machine Warning if gt 250 processes critical if
gt 400 users
define service
use local-service Name of service template
to use
host_name takeo
service_description Total Processes
check_command check_local_procs250400RSZDT
Define a service to check the load on the local machine
define service
use local-service Name of service template
to use
host_name takeo
service_description Current Load
check_command check_local_load5040301006040
Define a service to check the swap usage the local machine
Critical if less than 10 of swap is free warning if less than 20 is free
define service
use local-service Name of service template
to use
host_name takeo
service_description Swap Usage
check_command check_local_swap2010
Define a service to check SSH on the local machine
Disable notifications for this service by default as not all users may have SSH
enabled
IT Department Page 27
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
define service
use local-service Name of service template
to use
host_name takeo
service_description SSH
check_command check_ssh
notifications_enabled 0
Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
okdefine service
ok use local-service Name of service
template to use
ok host_name takeo
ok service_description HTTP_8080
ok check_command check_http_8080
ok notifications_enabled 0
ok
85 Explanations of localhost file and services
851 Creating A Host Definition
Before you can monitor a service you first need to define a host that is associated with the service If you have already created a host definition you can skip this step For this example lets say you want to monitor a variety of services on a remote host Lets call that host remotehost The host definition can be placed in its own file or added to an already exiting object configuration file Heres what the host definition for remotehost might look like define host
use generic-host Inherit default values from a
template
host_name remotehost The name were giving to this
host
alias Some Remote Host A longer name associated with the
host
address 192168150 IP address of the host
hostgroups allhosts Host groups this host is
associated with
So like this
host_name kohkong )
alias localhost )
address 19216825 )
parents Cist SW001 ) kohkong
icon_image susepng )
vrml_image susepng )
statusmap_image susepng )
IT Department Page 28
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
Now that a definition has been added for the host that will be monitored we can start defining services that should be monitored As with host definitions service definitions can be placed in any object configuration file
852 Monitoring HTTP
Chances are youre going to want to monitor web servers at some point - either yours or someone elses The check_http plugin is designed to do just that It understands the HTTP protocol and can monitor response time error codes strings in the returned HTML server certificates and much more The commandscfg file contains a command definition for using the check_http plugin It looks like this define command
name check_http
command_name check_http
command_line $USER1$check_http -I $HOSTADDRESS$ $ARG1$
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this define service
use generic-service Inherit default values from a
template
host_name remotehost
service_description HTTP
check_command check_http
This simple service definition will monitor the HTTP service running on remotehost It will produce alerts if the web server doesnt respond within 10 seconds or if it returns HTTP errors codes (403 404 etc) Thats all you need for basic monitoring Pretty simple huh Here after our exemple with takeo for http Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description HTTP
check_command check_http
notifications_enabled 0
And we can do this for all services If we would like to create a new service like http8080 here you are an exemple
IT Department Page 29
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
86 Others cfg files switchcfg printercfghellip
All other files have the same mechanism than the localhostcfg file The major principe is that they are specially designed for ldquoprintersrdquo ldquoswitchsrdquo and ldquowindowsrdquo But for windows server and linux servers remember that you need a agent on the server to scan the services Windows Machines -gt Nsclientexe Linux Machines -gt Nagios-plugin-147
9 Monitoring Windows Machines
91 Introduction
This document describes how you can monitor private services and attributes of Windows machines such as Memory usage CPU load Disk usage Service states Running processes etc Publicly available services that are provided by Windows machines (HTTP FTP POP3 etc) can be
monitored easily by following the documentation on monitoring publicly available services
Notes
These instructions assume that youve installed Nagios according to the quickstart guide The sample configuration entries below reference objects that are defined in the sample commandscfg and localhostcfg config files For your convenience the configuration examples given below can be found in a sample windowscfg config file that gets installed when you following the quickstart guide After reading these instructions just edit the windowscfg file to customize the host name IP address etc and uncomment the reference to the windowscfg file in the nagioscfg file
92 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines youll need to install an agent on those machines I recommend using the NSClient++ addon which can be found at
httpsourceforgenetprojectsnscplus These instructions will take you through a basic installation of the NSClient++ addon as well as the configuration of Nagios for monitoring the Windows machine
IT Department Page 30
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
1 Download the latest stable version of the NSClient++ addon from
httpsourceforgenetprojectsnscplus
2 Unzip the NSClient++ files into a new CNSClient++ directory 3 Open a command prompt and change to the CNSClient++ directory 4 Register the NSClient++ system service with the following command nsclient++ install 5 Install the NSClient++ systray with the following command nsclient++ SysTray Beware of the path where is installed the Nsclient++ 6 Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager) If it isnt already allowed to interact with the desktop check the box to allow it to
IT Department Page 31
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
7 Edit the NSCINI file (located in the CNSClient++ directory) and uncomment the allowed_hosts option Add the IP address of the Nagios server to this line or leave it blank to allow all hosts to connect
In our case (CIST) we have mad change with a new Firewall (Takeo) so we need to add for the route back the IP address of the Lan Interface of Takeo if this one is not in the same network as Nagios and the Nsclient Allowed host options are in ldquoremarkrdquo mode and has to be activated like this
IT Department Page 32
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
8 Start the NSClient++ service with the following command nsclient++ start 9 If installed properly a new icon should appear in your system tray It will be a yellow circle with a black M inside 10 Success The Windows server can now be added to the Nagios monitoring configuration
93 Nagios Host Configuration
Youll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine These definitions can be placed in their own file or added to an already exiting object configuration file First its best practice to create a new template for each different type of host youll be monitoring Lets create a new template for Windows server define host
name windows-server The name of this host template
use generic-host Inherit default values from the
generic-host template
check_period 24x7 By default Windows servers are
monitored round the clock
check_interval 5 Actively check the server every 5
minutes
retry_interval 1 Schedule host check retries at 1
minute intervals
max_check_attempts 10 Check each server 10
times (max)
check_command check-host-alive Default command to check
if servers are alive
IT Department Page 33
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
notification_period 24x7 Send notification out at any time
- day or night
notification_interval 30 Resend notifications every 30
minutes
notification_options dr Only send notifications for
specific host states
contact_groups admins Notifications get sent to the
admins by default
register 0 DONT REGISTER THIS - ITS
JUST A TEMPLATE
Notice that the Windows server template definition inherits default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use                     windows-server     ; Inherit default values from a template
        host_name               winserver          ; The name we're giving to this host
        alias                   My Windows Server  ; A longer name associated with the host
        address                 192.168.1.2        ; IP address of the host
        hostgroups              allhosts           ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:
define hostgroup{
        hostgroup_name          windows-servers    ; The name of the hostgroup
        alias                   Windows Servers    ; Long name of the group
        members                 winserver          ; Comma-separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
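As more Windows machines are added, the members directive simply grows into a comma-separated list. The following is only a sketch: it assumes that host definitions named kandal and pailin exist (two of the Windows 2003 servers at CIST); those host_name values are assumptions and are not taken from the CIST configuration files.

```
# Sketch only: kandal and pailin are assumed host_name values
define hostgroup{
        hostgroup_name          windows-servers
        alias                   Windows Servers
        members                 winserver,kandal,pailin
        }
```

Every name listed in members must match the host_name of an existing host definition, otherwise the configuration check fails.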
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor.
All of the service examples covered here use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
        command_name            check_nt
        command_line            $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
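To make the argument passing concrete, here is a hedged illustration of how the check_command value of a service is split on "!" and substituted into the $ARG1$ and $ARG2$ macros of the check_nt command definition. The address 192.168.1.2 is the winserver example address; /usr/local/nagios/libexec is the usual value of $USER1$ after a quickstart installation and is an assumption here.

```
# Service side:
#     check_command   check_nt!CPULOAD!-l 5,80,90
#
# Nagios splits the value on "!":
#     command name  = check_nt
#     $ARG1$        = CPULOAD
#     $ARG2$        = -l 5,80,90
#
# Resulting command line
# (assuming $USER1$ = /usr/local/nagios/libexec and $HOSTADDRESS$ = 192.168.1.2):
#     /usr/local/nagios/libexec/check_nt -H 192.168.1.2 -p 12489 -v CPULOAD -l 5,80,90
```

Running the resulting command line by hand on the Nagios server is a quick way to verify that NSClient++ answers before the service is added to the configuration.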
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:
define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
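The CPULOAD variable of check_nt accepts several interval,warning,critical triplets in a single -l argument. As a sketch, the following variant checks both the 5-minute and the 15-minute averages; the 15-minute thresholds (70 and 80) are illustrative assumptions, not values used at CIST.

```
# Sketch only: the second triplet (15,70,80) is an assumed example
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90,15,70,80
        }
```

This keeps one service per host while still catching a sustained load that a single short interval could miss.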
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater:
define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped:
define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
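SERVICESTATE also accepts a comma-separated list of service names after -l, so several Windows services can be watched by one Nagios service. A minimal sketch, in which Spooler (the Windows print spooler) is only an illustrative second service and is not taken from the CIST configuration:

```
# Sketch only: "Spooler" is an assumed example of a second service name
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

The trade-off is that a single alert then covers both services; separate definitions give one alert per service.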
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running:
define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to get a smooth map
The Statusmap is the visual, human-readable status of the CIST network.
To get a smooth map, however, we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons exist in three formats: GIF, GD2 and PNG. The best approach is to use only PNG files, so that the same icon can be used by all the different modules of Nagios. Example with kohkong:
define host{
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }
In this case we use suse.png.
10.2.1 Icon image
This is used for the normal Nagios menus.
10.2.2 Vrml_image
This is used for the 3D map environment, but because of our special Nagios theme we do not use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
This is used for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is the GD2 format at 40x40 pixels. You can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons: copy them, with your favorite SSH file-transfer tool, into /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them...
Hosts shown on the map:
- KAMPOT & KEP (HP Proliant): Primary and Secondary Domain Controller, DNS, DHCP, NTP, WSUS. Windows 2003 Server. 3.2 GHz, 2 GB, 148 GB RAID 1.
- KOHKONG (HP Proliant): SMTP, POP, Antispam, Mail Antivirus. Open SuSE 10.2. 3.2 GHz, 2 GB, 280 GB RAID 5.
- KANDAL (HP Proliant): Students Files Server, Moodle, Antivirus ERO, Instant Messaging. Windows 2003 Server. 3.2 GHz, 2 GB, 280 GB RAID 5.
- TAKEO (HP Desktop): Proxy, Firewall. SuSE LES 10.2. 3.2 GHz, 2 GB, 80 GB RAID 1.
- MODEM: ADSL Gateway. 512 Mbs. Fixed public IP address.
- PURSAT (PTC Desktop): Supervision. Open SuSE 10.2. 2.6 GHz, 2 GB, 80 GB RAID 1.
- PAILIN (PTC Desktop): Learning Management, DataBase, Print server, Staff Files Server. Windows 2003 Server. 2.6 GHz, 2 GB, 320 GB RAID 1.
- PREYVENG (PTC Desktop): Data backup (Kohkong, Kandal, Pailin), Ghost server. Windows 2003 Server. 2.6 GHz, 2 GB, 500 GB RAID 1.
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006.
Network zones: Internet; Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet access only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs).
Subnets: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23.
        check_command           check_local_users!20!50
# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service    ; Name of service template to use
        host_name               Nagios
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service    ; Name of service template to use
        host_name               Nagios
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service    ; Name of service template to use
        host_name               Nagios
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service    ; Name of service template to use
        host_name               Nagios
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service    ; Name of service template to use
        host_name               Nagios
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }

# End COPY/PASTE for SERVICES
# Define a service to ping the machine (kohkong).
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service  ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
# Define a service to ping the machine (takeo).
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     Root Partition
        check_command           check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     Current Users
        check_command           check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     Total Processes
        check_command           check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     Swap Usage
        check_command           check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     local-service    ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
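The listing above repeats the same block for each machine. As a hedged sketch (not part of the CIST configuration), Nagios also accepts a comma-separated list in host_name, so one definition can cover several hosts. This only makes sense for checks that are run against the address of the host, such as check_ping; the host names are the ones used in this file.

```
# Sketch only: one PING service applied to two hosts at once
define service{
        use                     generic-service
        host_name               kohkong,takeo
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```

If used, such a definition would replace the two separate PING definitions for kohkong and takeo rather than be added next to them, since a host cannot have two services with the same description.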
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step.
For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:
define host{
        use                     generic-host      ; Inherit default values from a template
        host_name               remotehost        ; The name we're giving to this host
        alias                   Some Remote Host  ; A longer name associated with the host
        address                 192.168.1.50      ; IP address of the host
        hostgroups              allhosts          ; Host groups this host is associated with
        }
In our case, for kohkong, it looks like this:
define host{
        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
        }
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more.
The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:
define command{
        name                    check_http
        command_name            check_http
        command_line            $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:
define service{
        use                     generic-service   ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?
Here is our own example for HTTP, on kohkong:
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
        use                     generic-service   ; Name of service template to use
        host_name               kohkong
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
The same can be done for all services, including new ones such as HTTP on port 8080 (see the HTTP_8080 service defined for takeo in localhost.cfg).
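The HTTP_8080 service defined for takeo refers to a command named check_http_8080, whose definition is not shown in this document. A minimal sketch of what it might look like in commands.cfg follows; it assumes that the service on takeo listens on TCP port 8080 and uses the -p (port) option of the check_http plugin.

```
# Assumed definition of check_http_8080; adapt the port to your own service
define command{
        command_name            check_http_8080
        command_line            $USER1$/check_http -I $HOSTADDRESS$ -p 8080
        }

# Service using the command above (same fields as the takeo entry in localhost.cfg)
define service{
        use                     local-service
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

Defining a dedicated command keeps the service definition short; an alternative would be to pass the port through $ARG1$ of the existing check_http command.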
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. We recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command: nsclient++ /install
5. Install the NSClient++ systray with the following command: nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
IT Department Page 24
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
host_name Nagios
service_description HTTP
check_command check_http
notifications_enabled 0
End COPYPASTE for SERVICES
Define a service to ping the local machine kohkong kohkong
define service
use generic-service Name of service template
to use
host_name kohkong
service_description PING
check_command check_ping100020500060
Define a service to check the disk space of the root partition
on the local machine Warning if lt 20 free critical if
lt 10 free space on partition
define service
use generic-service Name of service
template to use
host_name kohkong
service_description Root Partition
check_command check_local_disk2010
Define a service to check the number of currently logged in
users on the local machine Warning if gt 20 users critical
if gt 50 users
define service
use generic-service Name of service
template to use
host_name kohkong
service_description Current Users
check_command check_local_users2050
Define a service to check the number of currently running procs
on the local machine Warning if gt 250 processes critical if
gt 400 users
define service
use generic-service Name of service template
to use
host_name kohkong
service_description Total Processes
check_command check_local_procs250400RSZDT
Define a service to check the load on the local machine
define service
use generic-service Name of service
template to use
IT Department Page 25
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
host_name kohkong
service_description Current Load
check_command check_local_load5040301006040
Define a service to check the swap usage the local machine
Critical if less than 10 of swap is free warning if less than 20 is free
define service
use generic-service Name of service
template to use
host_name kohkong
service_description Swap Usage
check_command check_local_swap2010
Define a service to check SSH on the local machine
Disable notifications for this service by default as not all users may have SSH
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description SSH
check_command check_ssh
notifications_enabled 0
Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description HTTP
check_command check_http
notifications_enabled 0
# takeo

# Define a service to "ping" the local machine.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }

# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Current Users
        check_command                   check_local_users!20!50
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        }

# Define a service to check the load on the local machine.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Current Load
        check_command                   check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

# Define a service to check the swap usage on the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             Swap Usage
        check_command                   check_local_swap!20!10
        }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             local-service           ; Name of service template to use
        host_name                       takeo
        service_description             HTTP_8080
        check_command                   check_http_8080
        notifications_enabled           0
        }
8.5 Explanations of localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
        use             generic-host            ; Inherit default values from a template
        host_name       remotehost              ; The name we're giving to this host
        alias           Some Remote Host        ; A longer name associated with the host
        address         192.168.1.50            ; IP address of the host
        hostgroups      allhosts                ; Host groups this host is associated with
        }
So, for our host kohkong, the directives look like this:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png
Now that a definition has been added for the host that will be monitored, we can start defining the services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
        name            check_http
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
        use                     generic-service         ; Inherit default values from a template
        host_name               remotehost
        service_description     HTTP
        check_command           check_http
        }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our own example for HTTP, on kohkong:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       kohkong
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }
We can do the same for all services, including a new service such as HTTP on port 8080 (the HTTP_8080 service defined for takeo in localhost.cfg).
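The HTTP_8080 service for takeo refers to a check_http_8080 command whose definition does not appear in this document. As a minimal sketch, assuming the command simply runs check_http against port 8080, it could be defined as follows. The command body is an assumption; only the names check_http_8080, HTTP_8080 and takeo come from localhost.cfg, and -p is the port option of check_http.

```cfg
# Hypothetical definition of the check_http_8080 command (an assumption,
# not the definition actually used at CIST).
define command{
        command_name    check_http_8080
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080 $ARG1$
        }

# Service that uses the command above, as in localhost.cfg
define service{
        use                     local-service           ; Name of service template to use
        host_name               takeo
        service_description     HTTP_8080
        check_command           check_http_8080
        notifications_enabled   0
        }
```

Because the stock check_http command passes $ARG1$ through to the plugin, writing check_command check_http!-p 8080 in the service would probably achieve the same result without a separate command definition.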
8.6 Other cfg files (switch.cfg, printer.cfg, …)
All the other files work the same way as the localhost.cfg file. The main principle is that each one is specially designed for one kind of device: "printers", "switches" and "windows". For Windows servers and Linux servers, remember that you need an agent on the server to scan the services:
Windows Machines -> Nsclient.exe
Linux Machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
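As a rough illustration of that last step, the reference in nagios.cfg is a cfg_file line that is shipped commented out. The directory shown here is an assumption based on the usual quickstart layout; use the directory where your windows.cfg actually lives.

```cfg
# nagios.cfg (path is an assumption; adjust it to your installation)

# As shipped, the reference is commented out:
#cfg_file=/usr/local/nagios/etc/objects/windows.cfg

# After removing the leading "#", Nagios loads the Windows definitions:
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
```

Nagios reads its configuration files only at startup, so the change takes effect after Nagios is restarted.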
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
        nsclient++ /install
5. Install the NSClient++ systray with the following command:
        nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have made a change with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and the NSClient. The allowed_hosts option is commented out by default and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command:
        nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:

define host{
        name                    windows-server   ; The name of this host template
        use                     generic-host     ; Inherit default values from the generic-host template
        check_period            24x7             ; By default, Windows servers are monitored round the clock
        check_interval          5                ; Actively check the server every 5 minutes
        retry_interval          1                ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10               ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7             ; Send notification out at any time - day or night
        notification_interval   30               ; Resend notifications every 30 minutes
        notification_options    d,r              ; Only send notifications for specific host states
        contact_groups          admins           ; Notifications get sent to the admins by default
        register                0                ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.1.2             ; IP address of the host
        hostgroups      allhosts                ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
        hostgroup_name  windows-servers         ; The name of the hostgroup
        alias           Windows Servers         ; Long name of the group
        members         winserver               ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
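To illustrate how the hostgroup grows when more servers are added, here is a minimal sketch with a second Windows host. The host name winserver2, its alias and its address are hypothetical placeholders, and the hostgroup definition shown would replace the one above rather than being added next to it.

```cfg
# Hypothetical second Windows host (name, alias and address are placeholders)
define host{
        use             windows-server          ; Inherit default values from the template
        host_name       winserver2
        alias           Second Windows Server
        address         192.168.1.3
        hostgroups      allhosts
        }

# The hostgroup now lists both hosts, separated by a comma
define hostgroup{
        hostgroup_name  windows-servers
        alias           Windows Servers
        members         winserver,winserver2
        }
```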
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
        use                     generic-service
        host_name               winserver
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
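The -l argument of CPULOAD is a triplet of minutes, warning threshold and critical threshold, and check_nt accepts several such triplets in one request. As a sketch, the variant below adds a 15-minute average to the 5-minute one; the 15-minute thresholds (70 and 80) are illustrative values, not settings used at CIST.

```cfg
# Variant of the CPU Load service with two averaging windows:
#   5-minute average:  WARNING at 80%, CRITICAL at 90%
#   15-minute average: WARNING at 70%, CRITICAL at 80% (illustrative values)
define service{
        use                     generic-service
        host_name               winserver
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90,15,70,80
        }
```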
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
        use                     generic-service
        host_name               winserver
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
        use                     generic-service
        host_name               winserver
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
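SERVICESTATE accepts a comma-separated list of service names after -l, so several Windows services can be watched by one Nagios service. The sketch below is an illustration only: Spooler (the Windows print spooler) and the service description are examples that are not part of the CIST configuration.

```cfg
# Illustrative only: check two Windows services in a single request.
# The check is expected to go CRITICAL if any listed service is stopped.
define service{
        use                     generic-service
        host_name               winserver
        service_description     Web and Print Services
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
        }
```

One service definition per Windows service, as in the W3SVC example, keeps alerts easier to read; the combined form mainly reduces the number of checks.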
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
        use                     generic-service
        host_name               winserver
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-readable view of the status of the CIST network.
But to have this smooth map, we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
The Nagios icons exist in several formats, including GIF and GD2. But the best thing to do is to use only PNG files, because then you have the same icon for all the different modules of Nagios. Sample with kohkong:

        host_name               kohkong
        alias                   localhost
        address                 192.168.2.5
        parents                 Cist SW001
        icon_image              suse.png
        vrml_image              suse.png
        statusmap_image         suse.png

In this case we use suse.png.
10.2.1 Icon image
icon_image is used for the normal Nagios menus.
10.2.2 Vrml_image
vrml_image is used for the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is GND format in 40x40 pcx. You can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons? The icons have to be copied, with your favorite SSH explorer, into /usr/local/nagios/share/images
11 CIST Monitored hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.

[Network map of the CIST hosts. The labels recovered from the diagram are listed below.]

Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Segments: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet Access
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006

- KAMPOT & KEP - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1: Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG - HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5: SMTP, POP, Antispam, Mail Antivirus
- KANDAL - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5: Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO - HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1: Proxy, Firewall
- MODEM - 512 Mbs, Fixed Public IP Address: ADSL Gateway
- PURSAT - PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1: Supervision
- PAILIN - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1: Learning Management DataBase, Print server, Staff Files Server
- PREYVENG - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1: Data backup (Kohkong, Kandal, Pailin), Ghost server
IT Department Page 25
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
host_name kohkong
service_description Current Load
check_command check_local_load5040301006040
Define a service to check the swap usage the local machine
Critical if less than 10 of swap is free warning if less than 20 is free
define service
use generic-service Name of service
template to use
host_name kohkong
service_description Swap Usage
check_command check_local_swap2010
Define a service to check SSH on the local machine
Disable notifications for this service by default as not all users may have SSH
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description SSH
check_command check_ssh
notifications_enabled 0
Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description HTTP
check_command check_http
notifications_enabled 0
Define a service to ping the local machine takeo takeo
define service
use local-service Name of service template
to use
host_name takeo
service_description PING
check_command check_ping100020500060
Define a service to check the disk space of the root partition
on the local machine Warning if lt 20 free critical if
lt 10 free space on partition
define service
IT Department Page 26
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
use local-service Name of service template
to use
host_name takeo
service_description Root Partition
check_command check_local_disk2010
Define a service to check the number of currently logged in
users on the local machine Warning if gt 20 users critical
if gt 50 users
define service
use local-service Name of service template
to use
host_name takeo
service_description Current Users
check_command check_local_users2050
Define a service to check the number of currently running procs
on the local machine Warning if gt 250 processes critical if
gt 400 users
define service
use local-service Name of service template
to use
host_name takeo
service_description Total Processes
check_command check_local_procs250400RSZDT
Define a service to check the load on the local machine
define service
use local-service Name of service template
to use
host_name takeo
service_description Current Load
check_command check_local_load5040301006040
Define a service to check the swap usage the local machine
Critical if less than 10 of swap is free warning if less than 20 is free
define service
use local-service Name of service template
to use
host_name takeo
service_description Swap Usage
check_command check_local_swap2010
Define a service to check SSH on the local machine
Disable notifications for this service by default as not all users may have SSH
enabled
IT Department Page 27
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
define service
use local-service Name of service template
to use
host_name takeo
service_description SSH
check_command check_ssh
notifications_enabled 0
Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
okdefine service
ok use local-service Name of service
template to use
ok host_name takeo
ok service_description HTTP_8080
ok check_command check_http_8080
ok notifications_enabled 0
ok
85 Explanations of localhost file and services
851 Creating A Host Definition
Before you can monitor a service you first need to define a host that is associated with the service If you have already created a host definition you can skip this step For this example lets say you want to monitor a variety of services on a remote host Lets call that host remotehost The host definition can be placed in its own file or added to an already exiting object configuration file Heres what the host definition for remotehost might look like define host
use generic-host Inherit default values from a
template
host_name remotehost The name were giving to this
host
alias Some Remote Host A longer name associated with the
host
address 192168150 IP address of the host
hostgroups allhosts Host groups this host is
associated with
So like this
host_name kohkong )
alias localhost )
address 19216825 )
parents Cist SW001 ) kohkong
icon_image susepng )
vrml_image susepng )
statusmap_image susepng )
IT Department Page 28
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
Now that a definition has been added for the host that will be monitored we can start defining services that should be monitored As with host definitions service definitions can be placed in any object configuration file
852 Monitoring HTTP
Chances are youre going to want to monitor web servers at some point - either yours or someone elses The check_http plugin is designed to do just that It understands the HTTP protocol and can monitor response time error codes strings in the returned HTML server certificates and much more The commandscfg file contains a command definition for using the check_http plugin It looks like this define command
name check_http
command_name check_http
command_line $USER1$check_http -I $HOSTADDRESS$ $ARG1$
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this define service
use generic-service Inherit default values from a
template
host_name remotehost
service_description HTTP
check_command check_http
This simple service definition will monitor the HTTP service running on remotehost It will produce alerts if the web server doesnt respond within 10 seconds or if it returns HTTP errors codes (403 404 etc) Thats all you need for basic monitoring Pretty simple huh Here after our exemple with takeo for http Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description HTTP
check_command check_http
notifications_enabled 0
And we can do this for all services If we would like to create a new service like http8080 here you are an exemple
IT Department Page 29
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
86 Others cfg files switchcfg printercfghellip
All other files have the same mechanism than the localhostcfg file The major principe is that they are specially designed for ldquoprintersrdquo ldquoswitchsrdquo and ldquowindowsrdquo But for windows server and linux servers remember that you need a agent on the server to scan the services Windows Machines -gt Nsclientexe Linux Machines -gt Nagios-plugin-147
9 Monitoring Windows Machines
91 Introduction
This document describes how you can monitor private services and attributes of Windows machines such as Memory usage CPU load Disk usage Service states Running processes etc Publicly available services that are provided by Windows machines (HTTP FTP POP3 etc) can be
monitored easily by following the documentation on monitoring publicly available services
Notes
These instructions assume that youve installed Nagios according to the quickstart guide The sample configuration entries below reference objects that are defined in the sample commandscfg and localhostcfg config files For your convenience the configuration examples given below can be found in a sample windowscfg config file that gets installed when you following the quickstart guide After reading these instructions just edit the windowscfg file to customize the host name IP address etc and uncomment the reference to the windowscfg file in the nagioscfg file
92 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines youll need to install an agent on those machines I recommend using the NSClient++ addon which can be found at
httpsourceforgenetprojectsnscplus These instructions will take you through a basic installation of the NSClient++ addon as well as the configuration of Nagios for monitoring the Windows machine
IT Department Page 30
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
1 Download the latest stable version of the NSClient++ addon from
httpsourceforgenetprojectsnscplus
2 Unzip the NSClient++ files into a new CNSClient++ directory 3 Open a command prompt and change to the CNSClient++ directory 4 Register the NSClient++ system service with the following command nsclient++ install 5 Install the NSClient++ systray with the following command nsclient++ SysTray Beware of the path where is installed the Nsclient++ 6 Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager) If it isnt already allowed to interact with the desktop check the box to allow it to
IT Department Page 31
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
7 Edit the NSCINI file (located in the CNSClient++ directory) and uncomment the allowed_hosts option Add the IP address of the Nagios server to this line or leave it blank to allow all hosts to connect
In our case (CIST) we have mad change with a new Firewall (Takeo) so we need to add for the route back the IP address of the Lan Interface of Takeo if this one is not in the same network as Nagios and the Nsclient Allowed host options are in ldquoremarkrdquo mode and has to be activated like this
IT Department Page 32
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
8 Start the NSClient++ service with the following command nsclient++ start 9 If installed properly a new icon should appear in your system tray It will be a yellow circle with a black M inside 10 Success The Windows server can now be added to the Nagios monitoring configuration
93 Nagios Host Configuration
Youll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine These definitions can be placed in their own file or added to an already exiting object configuration file First its best practice to create a new template for each different type of host youll be monitoring Lets create a new template for Windows server define host
name windows-server The name of this host template
use generic-host Inherit default values from the
generic-host template
check_period 24x7 By default Windows servers are
monitored round the clock
check_interval 5 Actively check the server every 5
minutes
retry_interval 1 Schedule host check retries at 1
minute intervals
max_check_attempts 10 Check each server 10
times (max)
check_command check-host-alive Default command to check
if servers are alive
IT Department Page 33
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
notification_period 24x7 Send notification out at any time
- day or night
notification_interval 30 Resend notifications every 30
minutes
notification_options dr Only send notifications for
specific host states
contact_groups admins Notifications get sent to the
admins by default
register 0 DONT REGISTER THIS - ITS
JUST A TEMPLATE
Notice that the Windows server template definition is inheriting default values from the generic-host template which is defined in the sample localhostcfg file
Next define a new host for the Windows machine that references the newly created windows-server host template
define host
use windows-server Inherit default values from a template
host_name winserver The name were giving to this
host
alias My Windows Server A longer name associated with the
host
address 19216812 IP address of the host
hostgroups allhosts Host groups this server is
associated with
Add an optional hostgroup for Windows servers This is useful if you create additional servers in the future
and want to view them together in the CGIs It can also be useful for object definition tricks that you can use to manage larger configurations later on define hostgroup
hostgroup_name windows-servers The name of the hostgroup
alias Windows Servers Long name of the group
members winserver Comma separated list of hosts
that belong to this group
The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhostcfg) and windows-servers (which is defined above)
94 Monitoring Services
Now that the NSCLient++ addon has been installed on the Windows machine and youve configured a host definition for the machine in Nagios you can addon some service definitions for things you want to monitor All of the service examples Ill cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine The check_nt plugin is included in the Nagios plugins distribution and a command definition for using the plugin has been defined in the commandscfg file It looks like this define command
command_name check_nt
command_line $USER1$check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$
$ARG2$
Now lets go over some example service definitions for monitoring different aspects of the Windows machine
IT Department Page 34
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
95 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server This is useful when it comes time to upgrade your Windows servers to a newer version of the addon define service
use generic-service
host_name winserver
service_description NSClient++ Version
check_command check_ntCLIENTVERSION
96 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server define service
use generic-service
host_name winserver
service_description Uptime
check_command check_ntUPTIME
97 Monitoring Cpu Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90 or more or a WARNING alert if the 5-minute load is 80 or greater define service
use generic-service
host_name winserver
service_description CPU Load
check_command check_ntCPULOAD-l 58090
98 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90 or more or a WARNING alert if memory usage is 80 or greater define service
use generic-service
host_name winserver
service_description Memory Usage
check_command check_ntMEMUSE-w 80 -c 90
IT Department Page 35
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
    use                    generic-service
    host_name              winserver
    service_description    C:\ Drive Space
    check_command          check_nt!USEDDISKSPACE!-l c -w 80 -c 90
    }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
    use                    generic-service
    host_name              winserver
    service_description    W3SVC
    check_command          check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
    }
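The check_nt help text describes the -l argument for SERVICESTATE as a comma-separated list, so one Nagios service can watch several Windows services at once. A minimal sketch, assuming you also want to watch the print spooler (the Spooler service and the service description are examples only, not part of the CIST configuration):

```cfg
# Hypothetical variant: one check for two Windows services.
# The check goes CRITICAL if any listed service is stopped.
define service{
    use                    generic-service
    host_name              winserver
    service_description    Web and Print Services
    check_command          check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler
    }
```

Grouping services this way reduces the number of checks, at the cost of a less specific alert. Separate definitions make it clearer which service failed.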
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
    use                    generic-service
    host_name              winserver
    service_description    Explorer
    check_command          check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
    }
10 Statusmap
10.1 How to get a smooth map
The Statusmap is the human-readable, visual status of the CIST network.
To get a smooth-looking map, we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons come in several formats, including GIF and GD2. The best approach, however, is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:

    host_name          kohkong
    alias              localhost
    address            192.168.2.5
    parents            Cist SW001
    icon_image         suse.png
    vrml_image         suse.png
    statusmap_image    suse.png

In this case we use suse.png.
10.2.1 Icon image
This image is used in the normal Nagios menus.
10.2.2 Vrml_image
This image is used in the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
This image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 pixels, so you can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org

Where to put the icons: copy the icons with your favorite SSH file transfer tool to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them...
[Network map of the CIST monitored hosts. The labels recoverable from the figure are listed below.]

Networks: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Segments: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet Access; Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006

Hosts:
- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, Fixed Public IP Address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
    use                     local-service    ; Name of service template to use
    host_name               takeo
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/

# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.

define service{
    use                     local-service    ; Name of service template to use
    host_name               takeo
    service_description     Current Users
    check_command           check_local_users!20!50
    }

# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.

define service{
    use                     local-service    ; Name of service template to use
    host_name               takeo
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
    }

# Define a service to check the load on the local machine.

define service{
    use                     local-service    ; Name of service template to use
    host_name               takeo
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free.

define service{
    use                     local-service    ; Name of service template to use
    host_name               takeo
    service_description     Swap Usage
    check_command           check_local_swap!20!10
    }

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
    use                     local-service    ; Name of service template to use
    host_name               takeo
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
    }

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

ok define service{
ok     use                     local-service    ; Name of service template to use
ok     host_name               takeo
ok     service_description     HTTP_8080
ok     check_command           check_http_8080
ok     notifications_enabled   0
ok     }
8.5 Explanations of the localhost file and services
8.5.1 Creating A Host Definition
Before you can monitor a service, you first need to define a host that is associated with the service. If you have already created a host definition, you can skip this step. For this example, let's say you want to monitor a variety of services on a remote host. Let's call that host remotehost. The host definition can be placed in its own file or added to an already existing object configuration file. Here's what the host definition for remotehost might look like:

define host{
    use           generic-host        ; Inherit default values from a template
    host_name     remotehost          ; The name we're giving to this host
    alias         Some Remote Host    ; A longer name associated with the host
    address       192.168.1.50        ; IP address of the host
    hostgroups    allhosts            ; Host groups this host is associated with
    }

In our case, for kohkong, it looks like this:

    host_name          kohkong
    alias              localhost
    address            192.168.2.5
    parents            Cist SW001
    icon_image         suse.png
    vrml_image         suse.png
    statusmap_image    suse.png
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point - either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
    name            check_http
    command_name    check_http
    command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
    }

A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
    use                    generic-service    ; Inherit default values from a template
    host_name              remotehost
    service_description    HTTP
    check_command          check_http
    }

This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh?

Below is our own example for HTTP. The definition shown uses host_name kohkong.

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
    use                     generic-service    ; Name of service template to use
    host_name               kohkong
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
    }
We can do the same for all services, for example to create a new service such as HTTP on port 8080.
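A minimal sketch of what such a port 8080 check could look like is given here. The service mirrors the HTTP_8080 entry from our localhost.cfg listing. The command definition is an assumption, since the actual check_http_8080 definition from the CIST commands.cfg is not reproduced in this document. It relies on the -p (port) option of the check_http plugin.

```cfg
# Assumed command definition: check_http against port 8080.
define command{
    command_name    check_http_8080
    command_line    $USER1$/check_http -I $HOSTADDRESS$ -p 8080
    }

# Service using the command above.
define service{
    use                     local-service    ; Name of service template to use
    host_name               takeo
    service_description     HTTP_8080
    check_command           check_http_8080
    notifications_enabled   0
    }
```

An alternative that avoids a second command is to keep the stock check_http command and pass the port as an argument, for example check_http!-p 8080, since its command line already ends with $ARG1$.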
8.6 Other cfg files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "windows". For Windows and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor private services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
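As a sketch of that last step, uncommenting the reference means removing the leading "#" from the cfg_file line in nagios.cfg. The path below assumes the quickstart layout under /usr/local/nagios. Adjust it if your configuration files live elsewhere.

```cfg
# nagios.cfg, before the change (windows.cfg is ignored):
#cfg_file=/usr/local/nagios/etc/objects/windows.cfg

# nagios.cfg, after the change (windows.cfg is loaded on the next restart):
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
```

Only one of the two lines should be present in the real file. It is a good idea to verify the configuration before restarting Nagios, which with the quickstart layout is done by running /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg.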
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command: nsclient++ /install
5. Install the NSClient++ systray with the following command: nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we made changes involving a new firewall (Takeo). For the return route we therefore also need to add the IP address of Takeo's LAN interface, if it is not in the same network as Nagios and NSClient. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated by uncommenting it.
8. Start the NSClient++ service with the following command: nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black M inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers.

define host{
    name                     windows-server      ; The name of this host template
    use                      generic-host        ; Inherit default values from the generic-host template
    check_period             24x7                ; By default, Windows servers are monitored round the clock
    check_interval           5                   ; Actively check the server every 5 minutes
    retry_interval           1                   ; Schedule host check retries at 1 minute intervals
    max_check_attempts       10                  ; Check each server 10 times (max)
    check_command            check-host-alive    ; Default command to check if servers are alive
    notification_period      24x7                ; Send notifications out at any time - day or night
    notification_interval    30                  ; Resend notifications every 30 minutes
    notification_options     d,r                 ; Only send notifications for specific host states
    contact_groups           admins              ; Notifications get sent to the admins by default
    register                 0                   ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
    }
Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template.

define host{
    use           windows-server       ; Inherit default values from a template
    host_name     winserver            ; The name we're giving to this host
    alias         My Windows Server    ; A longer name associated with the host
    address       192.168.1.2          ; IP address of the host
    hostgroups    allhosts             ; Host groups this server is associated with
    }

Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
    hostgroup_name    windows-servers    ; The name of the hostgroup
    alias             Windows Servers    ; Long name of the group
    members           winserver          ; Comma separated list of hosts that belong to this group
    }

The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
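One example of the kind of object definition trick a hostgroup makes possible: a service definition may name a hostgroup instead of a single host, and Nagios then creates the service for every member of the group. The sketch below is an assumption about how you might use this, not part of the CIST configuration.

```cfg
# Hypothetical: apply the uptime check to every host in windows-servers,
# instead of writing one service definition per host.
define service{
    use                    generic-service
    hostgroup_name         windows-servers
    service_description    Uptime
    check_command          check_nt!UPTIME
    }
```

Use this instead of, not in addition to, a per-host service with the same service_description, otherwise the same service would be defined twice for winserver.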
IT Department Page 27
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
define service
use local-service Name of service template
to use
host_name takeo
service_description SSH
check_command check_ssh
notifications_enabled 0
Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
okdefine service
ok use local-service Name of service
template to use
ok host_name takeo
ok service_description HTTP_8080
ok check_command check_http_8080
ok notifications_enabled 0
ok
85 Explanations of localhost file and services
851 Creating A Host Definition
Before you can monitor a service you first need to define a host that is associated with the service If you have already created a host definition you can skip this step For this example lets say you want to monitor a variety of services on a remote host Lets call that host remotehost The host definition can be placed in its own file or added to an already exiting object configuration file Heres what the host definition for remotehost might look like define host
use generic-host Inherit default values from a
template
host_name remotehost The name were giving to this
host
alias Some Remote Host A longer name associated with the
host
address 192168150 IP address of the host
hostgroups allhosts Host groups this host is
associated with
So like this
host_name kohkong )
alias localhost )
address 19216825 )
parents Cist SW001 ) kohkong
icon_image susepng )
vrml_image susepng )
statusmap_image susepng )
IT Department Page 28
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
Now that a definition has been added for the host that will be monitored we can start defining services that should be monitored As with host definitions service definitions can be placed in any object configuration file
852 Monitoring HTTP
Chances are youre going to want to monitor web servers at some point - either yours or someone elses The check_http plugin is designed to do just that It understands the HTTP protocol and can monitor response time error codes strings in the returned HTML server certificates and much more The commandscfg file contains a command definition for using the check_http plugin It looks like this define command
name check_http
command_name check_http
command_line $USER1$check_http -I $HOSTADDRESS$ $ARG1$
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this define service
use generic-service Inherit default values from a
template
host_name remotehost
service_description HTTP
check_command check_http
This simple service definition will monitor the HTTP service running on remotehost It will produce alerts if the web server doesnt respond within 10 seconds or if it returns HTTP errors codes (403 404 etc) Thats all you need for basic monitoring Pretty simple huh Here after our exemple with takeo for http Define a service to check HTTP on the local machine
Disable notifications for this service by default as not all users may have HTTP
enabled
define service
use generic-service Name of service template
to use
host_name kohkong
service_description HTTP
check_command check_http
notifications_enabled 0
And we can do this for all services If we would like to create a new service like http8080 here you are an exemple
IT Department Page 29
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
86 Others cfg files switchcfg printercfghellip
All other files have the same mechanism than the localhostcfg file The major principe is that they are specially designed for ldquoprintersrdquo ldquoswitchsrdquo and ldquowindowsrdquo But for windows server and linux servers remember that you need a agent on the server to scan the services Windows Machines -gt Nsclientexe Linux Machines -gt Nagios-plugin-147
9 Monitoring Windows Machines
91 Introduction
This document describes how you can monitor private services and attributes of Windows machines such as Memory usage CPU load Disk usage Service states Running processes etc Publicly available services that are provided by Windows machines (HTTP FTP POP3 etc) can be
monitored easily by following the documentation on monitoring publicly available services
Notes
These instructions assume that youve installed Nagios according to the quickstart guide The sample configuration entries below reference objects that are defined in the sample commandscfg and localhostcfg config files For your convenience the configuration examples given below can be found in a sample windowscfg config file that gets installed when you following the quickstart guide After reading these instructions just edit the windowscfg file to customize the host name IP address etc and uncomment the reference to the windowscfg file in the nagioscfg file
92 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines youll need to install an agent on those machines I recommend using the NSClient++ addon which can be found at
httpsourceforgenetprojectsnscplus These instructions will take you through a basic installation of the NSClient++ addon as well as the configuration of Nagios for monitoring the Windows machine
IT Department Page 30
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
1 Download the latest stable version of the NSClient++ addon from
httpsourceforgenetprojectsnscplus
2 Unzip the NSClient++ files into a new CNSClient++ directory 3 Open a command prompt and change to the CNSClient++ directory 4 Register the NSClient++ system service with the following command nsclient++ install 5 Install the NSClient++ systray with the following command nsclient++ SysTray Beware of the path where is installed the Nsclient++ 6 Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager) If it isnt already allowed to interact with the desktop check the box to allow it to
IT Department Page 31
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
7 Edit the NSCINI file (located in the CNSClient++ directory) and uncomment the allowed_hosts option Add the IP address of the Nagios server to this line or leave it blank to allow all hosts to connect
In our case (CIST) we have mad change with a new Firewall (Takeo) so we need to add for the route back the IP address of the Lan Interface of Takeo if this one is not in the same network as Nagios and the Nsclient Allowed host options are in ldquoremarkrdquo mode and has to be activated like this
IT Department Page 32
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
8 Start the NSClient++ service with the following command nsclient++ start 9 If installed properly a new icon should appear in your system tray It will be a yellow circle with a black M inside 10 Success The Windows server can now be added to the Nagios monitoring configuration
93 Nagios Host Configuration
Youll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine These definitions can be placed in their own file or added to an already exiting object configuration file First its best practice to create a new template for each different type of host youll be monitoring Lets create a new template for Windows server define host
name windows-server The name of this host template
use generic-host Inherit default values from the
generic-host template
check_period 24x7 By default Windows servers are
monitored round the clock
check_interval 5 Actively check the server every 5
minutes
retry_interval 1 Schedule host check retries at 1
minute intervals
max_check_attempts 10 Check each server 10
times (max)
check_command check-host-alive Default command to check
if servers are alive
IT Department Page 33
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
notification_period 24x7 Send notification out at any time
- day or night
notification_interval 30 Resend notifications every 30
minutes
notification_options dr Only send notifications for
specific host states
contact_groups admins Notifications get sent to the
admins by default
register 0 DONT REGISTER THIS - ITS
JUST A TEMPLATE
Notice that the Windows server template definition is inheriting default values from the generic-host template which is defined in the sample localhostcfg file
Next define a new host for the Windows machine that references the newly created windows-server host template
define host
use windows-server Inherit default values from a template
host_name winserver The name were giving to this
host
alias My Windows Server A longer name associated with the
host
address 19216812 IP address of the host
hostgroups allhosts Host groups this server is
associated with
Add an optional hostgroup for Windows servers This is useful if you create additional servers in the future
and want to view them together in the CGIs It can also be useful for object definition tricks that you can use to manage larger configurations later on define hostgroup
hostgroup_name windows-servers The name of the hostgroup
alias Windows Servers Long name of the group
members winserver Comma separated list of hosts
that belong to this group
The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhostcfg) and windows-servers (which is defined above)
94 Monitoring Services
Now that the NSCLient++ addon has been installed on the Windows machine and youve configured a host definition for the machine in Nagios you can addon some service definitions for things you want to monitor All of the service examples Ill cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine The check_nt plugin is included in the Nagios plugins distribution and a command definition for using the plugin has been defined in the commandscfg file It looks like this define command
command_name check_nt
command_line $USER1$check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$
$ARG2$
Now lets go over some example service definitions for monitoring different aspects of the Windows machine
IT Department Page 34
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
    use                  generic-service
    host_name            winserver
    service_description  NSClient++ Version
    check_command        check_nt!CLIENTVERSION
    }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.

define service{
    use                  generic-service
    host_name            winserver
    service_description  Uptime
    check_command        check_nt!UPTIME
    }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.

define service{
    use                  generic-service
    host_name            winserver
    service_description  CPU Load
    check_command        check_nt!CPULOAD!-l 5,80,90
    }
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.

define service{
    use                  generic-service
    host_name            winserver
    service_description  Memory Usage
    check_command        check_nt!MEMUSE!-w 80 -c 90
    }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.

define service{
    use                  generic-service
    host_name            winserver
    service_description  C:\ Drive Space
    check_command        check_nt!USEDDISKSPACE!-l c -w 80 -c 90
    }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
    use                  generic-service
    host_name            winserver
    service_description  W3SVC
    check_command        check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
    }
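The same check can cover several Windows services at once, because the SERVICESTATE variable of check_nt accepts a comma-separated list of service names after -l. A minimal sketch, in which MSSQLSERVER is only a hypothetical second service name; replace it with a service that actually exists on your machine:

```cfg
define service{
    use                  generic-service
    host_name            winserver
    service_description  Web and Database Services   ; hypothetical description
    check_command        check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,MSSQLSERVER
    }
```

Grouping services this way keeps the configuration short, at the cost of a single combined alert instead of one alert per service.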
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
    use                  generic-service
    host_name            winserver
    service_description  Explorer
    check_command        check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
    }
10 Statusmap
10.1 How to Have a Smooth Map
The Statusmap is the visual, human-readable view of the status of the CIST network.
To get a smooth map, we need "smooth icons".
In our case, we have also replaced the default Nagios theme with another one.
10.2 Add / Changing Icons
Nagios icons exist in several formats, such as GIF and GD2. The best approach is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Here is a sample with the host kohkong:

    host_name        kohkong
    alias            localhost
    address          192.168.2.5
    parents          Cist SW001
    icon_image       suse.png
    vrml_image       suse.png
    statusmap_image  suse.png

In this case we use suse.png.
10.2.1 Icon_image
icon_image is used for the normal menu of Nagios.
10.2.2 Vrml_image
vrml_image is used for the 3D map environment, but because of our custom Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly: the "Cortona VRML client", which you can find at http://www.parallelgraphics.com/products/cortona
10.2.3 Statusmap_image
statusmap_image is used for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 pixels, and you can convert any icon you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons
The icons have to be copied, with your favorite SSH file explorer, to /usr/local/nagios/share/images
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them.

[Figure: CIST network map. Labels recovered from the diagram:]

Subnets: 192.168.1.0/30, 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23
Zones: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet Access; Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006

- KAMPOT & KEP (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1): Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG (HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5): SMTP, POP, Antispam, Mail Antivirus
- KANDAL (HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5): Students Files Server, Moodle, Antivirus, ERO, Instant Messaging
- TAKEO (HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1): Proxy, Firewall
- MODEM (512 Mbs, Fixed Public IP Address): ADSL Gateway
- PURSAT (PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1): Supervision
- PAILIN (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1): Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG (PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1): Data backup (Kohkong, Kandal, Pailin), Ghost server
Now that a definition has been added for the host that will be monitored, we can start defining services that should be monitored. As with host definitions, service definitions can be placed in any object configuration file.
8.5.2 Monitoring HTTP
Chances are you're going to want to monitor web servers at some point, either yours or someone else's. The check_http plugin is designed to do just that. It understands the HTTP protocol and can monitor response time, error codes, strings in the returned HTML, server certificates, and much more. The commands.cfg file contains a command definition for using the check_http plugin. It looks like this:

define command{
    name          check_http
    command_name  check_http
    command_line  $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
    }
A simple service definition for monitoring the HTTP service on the remotehost machine might look like this:

define service{
    use                  generic-service  ; Inherit default values from a template
    host_name            remotehost
    service_description  HTTP
    check_command        check_http
    }
This simple service definition will monitor the HTTP service running on remotehost. It will produce alerts if the web server doesn't respond within 10 seconds or if it returns HTTP error codes (403, 404, etc.). That's all you need for basic monitoring. Pretty simple, huh? Here is our example with takeo for HTTP:

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                    generic-service  ; Name of service template to use
    host_name              kohkong
    service_description    HTTP
    check_command          check_http
    notifications_enabled  0
    }
We can do this for all services, including new ones such as HTTP on port 8080.
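A service for a web server listening on port 8080 could look like the sketch below. It assumes the check_http command definition shown earlier, which passes $ARG1$ through to the plugin, and uses the plugin's -p option to select the port. The service description is a hypothetical name chosen for illustration.

```cfg
define service{
    use                    generic-service
    host_name              kohkong
    service_description    HTTP-8080           ; hypothetical name for illustration
    check_command          check_http!-p 8080  ; $ARG1$ becomes "-p 8080"
    notifications_enabled  0
    }
```

Because the port is passed as an argument, no new command definition is needed; the same pattern works for any other option of check_http.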
8.6 Other cfg Files (switch.cfg, printer.cfg, ...)
All the other files work the same way as the localhost.cfg file. The main principle is that they are specially designed for "printers", "switches" and "windows". For Windows and Linux servers, remember that you need an agent on the server to scan the services:
Windows machines -> Nsclient.exe
Linux machines -> Nagios-plugin-1.4.7
9 Monitoring Windows Machines
9.1 Introduction
This document describes how you can monitor private services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, running processes, etc. Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.
Notes
These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files. For your convenience, the configuration examples given below can be found in a sample windows.cfg config file that gets installed when you follow the quickstart guide. After reading these instructions, just edit the windows.cfg file to customize the host name, IP address, etc., and uncomment the reference to the windows.cfg file in the nagios.cfg file.
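In the sample nagios.cfg, the reference in question is a cfg_file line that is commented out with a leading #. The sketch below shows the change, assuming the default quickstart installation prefix /usr/local/nagios and the objects subdirectory used by the Nagios 3 sample files. Adjust the path if your layout differs.

```cfg
# Before: the Windows definitions are not loaded
#cfg_file=/usr/local/nagios/etc/objects/windows.cfg

# After: remove the leading # so Nagios reads windows.cfg at startup
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
```

Nagios only reads its configuration files when it starts, so restart it after making this change.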
9.2 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command: nsclient++ /install
5. Install the NSClient++ systray with the following command: nsclient++ SysTray
   Pay attention to the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it to.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST), we made changes with a new firewall (Takeo), so for the return route we also need to add the IP address of the LAN interface of Takeo if it is not in the same network as Nagios and the NSClient machine. The allowed_hosts option is commented out ("remark" mode) by default and has to be activated (uncommented).
8. Start the NSClient++ service with the following command: nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black M inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file. First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers.

define host{
    name                   windows-server    ; The name of this host template
    use                    generic-host      ; Inherit default values from the generic-host template
    check_period           24x7              ; By default, Windows servers are monitored round the clock
    check_interval         5                 ; Actively check the server every 5 minutes
    retry_interval         1                 ; Schedule host check retries at 1 minute intervals
    max_check_attempts     10                ; Check each server 10 times (max)
    check_command          check-host-alive  ; Default command to check if servers are alive
    notification_period    24x7              ; Send notification out at any time - day or night
    notification_interval  30                ; Resend notifications every 30 minutes
    notification_options   d,r               ; Only send notifications for specific host states
    contact_groups         admins            ; Notifications get sent to the admins by default
    register               0                 ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
    }

Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.
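Templates can also inherit from other templates, and a value set locally overrides the inherited one. As a sketch, a hypothetical second template (the name windows-server-lowprio is an assumption and not part of the sample files) could reuse everything from windows-server and only relax the check interval:

```cfg
define host{
    name            windows-server-lowprio  ; hypothetical template name
    use             windows-server          ; inherit everything from the template above
    check_interval  15                      ; override: check every 15 minutes instead of 5
    register        0                       ; still a template, not a real host
    }
```

A host that references this template with the use directive gets all the windows-server defaults except for the longer check interval.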
Next, define a new host for the Windows machine that references the newly created windows-server host template.

define host{
    use         windows-server     ; Inherit default values from a template
    host_name   winserver          ; The name we're giving to this host
    alias       My Windows Server  ; A longer name associated with the host
    address     192.168.1.2        ; IP address of the host
    hostgroups  allhosts           ; Host groups this server is associated with
    }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
    hostgroup_name  windows-servers  ; The name of the hostgroup
    alias           Windows Servers  ; Long name of the group
    members         winserver        ; Comma separated list of hosts that belong to this group
    }

The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
94 Monitoring Services
Now that the NSCLient++ addon has been installed on the Windows machine and youve configured a host definition for the machine in Nagios you can addon some service definitions for things you want to monitor All of the service examples Ill cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine The check_nt plugin is included in the Nagios plugins distribution and a command definition for using the plugin has been defined in the commandscfg file It looks like this define command
command_name check_nt
command_line $USER1$check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$
$ARG2$
Now lets go over some example service definitions for monitoring different aspects of the Windows machine
IT Department Page 34
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
95 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server This is useful when it comes time to upgrade your Windows servers to a newer version of the addon define service
use generic-service
host_name winserver
service_description NSClient++ Version
check_command check_ntCLIENTVERSION
96 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server define service
use generic-service
host_name winserver
service_description Uptime
check_command check_ntUPTIME
97 Monitoring Cpu Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90 or more or a WARNING alert if the 5-minute load is 80 or greater define service
use generic-service
host_name winserver
service_description CPU Load
check_command check_ntCPULOAD-l 58090
98 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90 or more or a WARNING alert if memory usage is 80 or greater define service
use generic-service
host_name winserver
service_description Memory Usage
check_command check_ntMEMUSE-w 80 -c 90
IT Department Page 35
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
99 Monitoring Disk Usage
The following service definition will monitor usage of the C drive on the Windows server and generate a CRITICAL alert if disk usage is 90 or more or a WARNING alert if disk usage is 80 or greater define service
use generic-service
host_name winserver
service_description C Drive Space
check_command check_ntUSEDDISKSPACE-l c -w 80 -c 90
910 Monitoring A Windows Service
The following service definition will monitoring the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped define service
use generic-service
host_name winserver
service_description W3SVC
check_command check_ntSERVICESTATE-d SHOWALL -l W3SVC
911 Monitoring A Windows Process
The following service definition will monitoring the Explorerexe process on the Windows machine and generate a CRITICAL alert if the process is not running define service
use generic-service
host_name winserver
service_description Explorer
check_command check_ntPROCSTATE-d SHOWALL -l Explorerexe
IT Department Page 36
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
10 Statusmap
101 How to have a smoothly map
The Statusmap is the Human Visuable status of the CIST Network
But to have this smoth map we need ldquosmooth iconsrdquo
IT Department Page 37
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
And in our case we have also change the default thems of nagios by another one
-gt
102 Add Changing Icons
The icons of nagios exists in three Formats GIFGD2 and GIF But the best thning to do is to use only PNG file cause you sould you havethe same icon for all differents modules of Nagios Sample with kohkong
host_name kohkong )
alias localhost )
address 19216825 )
parents Cist SW001 ) kohkong
icon_image susepng )
vrml_image susepng )
statusmap_image susepng )
We use in this case susepng
1021 Icon image
is for the normal menu of nagios
IT Department Page 38
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
1022 Vrml_image
is for the 3D Map environment but because of our special thems of nagios we donrsquot use In case of using the 3DMap the Windows Explorer or Firefox need a special plugin to run correctly You can find it at httpwwwparallelgraphicscomproductscortona ldquoCortona vrml clientrdquo
1023 Statusmap_image
is for the 2D Status Map the one we do use
IT Department Page 39
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
We do use special icons for it Those ones can be find at httpwwwnagiosexchangeorgImage_Packs750html the best fit is GND format in 40x40 pcx So you can also convert all your icons you find on internet to this special format Here it is a online tool to do this httpwwweasypictorg Where to put the icons The icons has to be put with your favorite SSH explorer in usrlocalnagiosshareimages
IT Department Page 40
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
11 Cist Monitored hosts Here after the map of all the hosts sacanned by Nagios All the red machines are monitored by Nagios But the list is not limitedhellip
1921682028
1921683026
172160023
Primary Secondary
Domain Controller
DNS DHCP NTP WSUS
- KAMPOT amp KEP -HP Proliant
Windows 2003 Server
32 GHz 2 GB 148 GB RAID 1
SMTP POP Antispam
Mail Antivirus
- KOHKONG -HP Proliant
Open SuSE 102
32 GHz 2 GB 280 GB RAID 5
Students Files Server Moodle
Antivirus ERO Instant Messaging
- KANDAL -HP Proliant
Windows 2003 Server
32 GHz 2 GB 280 GB RAID 5
Proxy Firewall
- TAKEO -HP Desktop
SuSE LES 102
32 GHz 2 GB 80 GB RAID 1
ADSL Gateway
- MODEM -512 Mbs
Fixed Public IP Address
Supervision
- PURSAT -PTC Desktop
Open SuSE 102
26 GHz 2 GB 80 GB RAID 1
Common Servers
Students PCs (~70 PCs) + VMWare
amp Virtual Company (Internet Access Only)Internet Access
Staff Servers Staff PCs Printers amp WiFi (~40 PCs)
1921681030
Learning Management DataBase
Print server Staff Files Server
- PAILIN -PTC Desktop
Windows 2003 Server
26 GHz 2 GB 320 GB RAID 1
Internet
CISTSW001
CISTSW003
CISTSW002
CISTSW006
CISTSW004CISTSW005
Data backup (Kohkong Kandal
Pailin) Ghost server
- PREYVENG -PTC Desktop
Windows 2003 Server
26 GHz 2 GB 500 GB RAID 1
IT Department Page 29
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
86 Others cfg files switchcfg printercfghellip
All other files have the same mechanism than the localhostcfg file The major principe is that they are specially designed for ldquoprintersrdquo ldquoswitchsrdquo and ldquowindowsrdquo But for windows server and linux servers remember that you need a agent on the server to scan the services Windows Machines -gt Nsclientexe Linux Machines -gt Nagios-plugin-147
9 Monitoring Windows Machines
91 Introduction
This document describes how you can monitor private services and attributes of Windows machines such as Memory usage CPU load Disk usage Service states Running processes etc Publicly available services that are provided by Windows machines (HTTP FTP POP3 etc) can be
monitored easily by following the documentation on monitoring publicly available services
Notes
These instructions assume that youve installed Nagios according to the quickstart guide The sample configuration entries below reference objects that are defined in the sample commandscfg and localhostcfg config files For your convenience the configuration examples given below can be found in a sample windowscfg config file that gets installed when you following the quickstart guide After reading these instructions just edit the windowscfg file to customize the host name IP address etc and uncomment the reference to the windowscfg file in the nagioscfg file
92 Installing the Windows Agent
Before you can begin monitoring private services and attributes of Windows machines youll need to install an agent on those machines I recommend using the NSClient++ addon which can be found at
httpsourceforgenetprojectsnscplus These instructions will take you through a basic installation of the NSClient++ addon as well as the configuration of Nagios for monitoring the Windows machine
IT Department Page 30
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
1 Download the latest stable version of the NSClient++ addon from
httpsourceforgenetprojectsnscplus
2 Unzip the NSClient++ files into a new CNSClient++ directory 3 Open a command prompt and change to the CNSClient++ directory 4 Register the NSClient++ system service with the following command nsclient++ install 5 Install the NSClient++ systray with the following command nsclient++ SysTray Beware of the path where is installed the Nsclient++ 6 Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the Log On tab of the services manager) If it isnt already allowed to interact with the desktop check the box to allow it to
IT Department Page 31
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
7 Edit the NSCINI file (located in the CNSClient++ directory) and uncomment the allowed_hosts option Add the IP address of the Nagios server to this line or leave it blank to allow all hosts to connect
In our case (CIST) we have mad change with a new Firewall (Takeo) so we need to add for the route back the IP address of the Lan Interface of Takeo if this one is not in the same network as Nagios and the Nsclient Allowed host options are in ldquoremarkrdquo mode and has to be activated like this
IT Department Page 32
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
8 Start the NSClient++ service with the following command nsclient++ start 9 If installed properly a new icon should appear in your system tray It will be a yellow circle with a black M inside 10 Success The Windows server can now be added to the Nagios monitoring configuration
93 Nagios Host Configuration
Youll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine These definitions can be placed in their own file or added to an already exiting object configuration file First its best practice to create a new template for each different type of host youll be monitoring Lets create a new template for Windows server define host
name windows-server The name of this host template
use generic-host Inherit default values from the
generic-host template
check_period 24x7 By default Windows servers are
monitored round the clock
check_interval 5 Actively check the server every 5
minutes
retry_interval 1 Schedule host check retries at 1
minute intervals
max_check_attempts 10 Check each server 10
times (max)
check_command check-host-alive Default command to check
if servers are alive
IT Department Page 33
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
notification_period 24x7 Send notification out at any time
- day or night
notification_interval 30 Resend notifications every 30
minutes
notification_options dr Only send notifications for
specific host states
contact_groups admins Notifications get sent to the
admins by default
register 0 DONT REGISTER THIS - ITS
JUST A TEMPLATE
Notice that the Windows server template definition is inheriting default values from the generic-host template which is defined in the sample localhostcfg file
Next define a new host for the Windows machine that references the newly created windows-server host template
define host
use windows-server Inherit default values from a template
host_name winserver The name were giving to this
host
alias My Windows Server A longer name associated with the
host
address 19216812 IP address of the host
hostgroups allhosts Host groups this server is
associated with
Add an optional hostgroup for Windows servers This is useful if you create additional servers in the future
and want to view them together in the CGIs It can also be useful for object definition tricks that you can use to manage larger configurations later on define hostgroup
hostgroup_name windows-servers The name of the hostgroup
alias Windows Servers Long name of the group
members winserver Comma separated list of hosts
that belong to this group
The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhostcfg) and windows-servers (which is defined above)
94 Monitoring Services
Now that the NSCLient++ addon has been installed on the Windows machine and youve configured a host definition for the machine in Nagios you can addon some service definitions for things you want to monitor All of the service examples Ill cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine The check_nt plugin is included in the Nagios plugins distribution and a command definition for using the plugin has been defined in the commandscfg file It looks like this define command
command_name check_nt
command_line $USER1$check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$
$ARG2$
Now lets go over some example service definitions for monitoring different aspects of the Windows machine
IT Department Page 34
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
95 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server This is useful when it comes time to upgrade your Windows servers to a newer version of the addon define service
use generic-service
host_name winserver
service_description NSClient++ Version
check_command check_ntCLIENTVERSION
96 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server define service
use generic-service
host_name winserver
service_description Uptime
check_command check_ntUPTIME
97 Monitoring Cpu Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90 or more or a WARNING alert if the 5-minute load is 80 or greater define service
use generic-service
host_name winserver
service_description CPU Load
check_command check_ntCPULOAD-l 58090
98 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90 or more or a WARNING alert if memory usage is 80 or greater define service
use generic-service
host_name winserver
service_description Memory Usage
check_command check_ntMEMUSE-w 80 -c 90
IT Department Page 35
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
99 Monitoring Disk Usage
The following service definition will monitor usage of the C drive on the Windows server and generate a CRITICAL alert if disk usage is 90 or more or a WARNING alert if disk usage is 80 or greater define service
use generic-service
host_name winserver
service_description C Drive Space
check_command check_ntUSEDDISKSPACE-l c -w 80 -c 90
910 Monitoring A Windows Service
The following service definition will monitoring the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped define service
use generic-service
host_name winserver
service_description W3SVC
check_command check_ntSERVICESTATE-d SHOWALL -l W3SVC
911 Monitoring A Windows Process
The following service definition will monitoring the Explorerexe process on the Windows machine and generate a CRITICAL alert if the process is not running define service
use generic-service
host_name winserver
service_description Explorer
check_command check_ntPROCSTATE-d SHOWALL -l Explorerexe
IT Department Page 36
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
10 Statusmap
101 How to have a smoothly map
The Statusmap is the Human Visuable status of the CIST Network
But to have this smoth map we need ldquosmooth iconsrdquo
IT Department Page 37
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
And in our case we have also change the default thems of nagios by another one
-gt
102 Add Changing Icons
The icons of nagios exists in three Formats GIFGD2 and GIF But the best thning to do is to use only PNG file cause you sould you havethe same icon for all differents modules of Nagios Sample with kohkong
host_name kohkong )
alias localhost )
address 19216825 )
parents Cist SW001 ) kohkong
icon_image susepng )
vrml_image susepng )
statusmap_image susepng )
We use in this case susepng
1021 Icon image
is for the normal menu of nagios
1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus
2. Unzip the NSClient++ files into a new C:\NSClient++ directory.
3. Open a command prompt and change to the C:\NSClient++ directory.
4. Register the NSClient++ system service with the following command:
    nsclient++ /install
5. Install the NSClient++ systray with the following command:
    nsclient++ SysTray
   Beware of the path where NSClient++ is installed.
6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the "Log On" tab of the services manager). If it isn't already allowed to interact with the desktop, check the box to allow it.
7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.
   In our case (CIST) we have made a change with a new firewall (Takeo), so for the route back we also need to add the IP address of Takeo's LAN interface if it is not in the same network as Nagios and NSClient++. The allowed_hosts option is in "remark" mode (commented out) by default and has to be activated by removing the comment marker at the start of its line.
8. Start the NSClient++ service with the following command:
    nsclient++ /start
9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black "M" inside.
10. Success! The Windows server can now be added to the Nagios monitoring configuration.
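The allowed_hosts change from step 7 can be double-checked before the service is started. The following is a minimal Python sketch, not part of NSClient++, that reports whether allowed_hosts is still commented out in the text of an NSC.INI file. It assumes the option is written as allowed_hosts=..., that commented lines start with ";", and that only the first active occurrence matters. The function name and the addresses in the samples are placeholders, not CIST values.

```python
def active_allowed_hosts(ini_text):
    """Return the addresses on the first active allowed_hosts line.

    Returns None when the option is missing or still commented out,
    and an empty list when it is active but blank (all hosts allowed).
    """
    for raw_line in ini_text.splitlines():
        line = raw_line.strip()
        # Assumption: a leading ';' marks a commented-out ("remark") line.
        if line.startswith(";") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().lower() == "allowed_hosts":
            return [host.strip() for host in value.split(",") if host.strip()]
    return None


# Placeholder file contents; 192.0.2.x are documentation-only addresses.
commented = "[Settings]\n;allowed_hosts=127.0.0.1/32\n"
activated = "[Settings]\nallowed_hosts=192.0.2.10,192.0.2.1\n"

print(active_allowed_hosts(commented))
print(active_allowed_hosts(activated))
```

A result of None means the option still has to be activated. In the second sample the list holds two entries, in the same way that the CIST setup lists both the Nagios server and the LAN interface of the firewall.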
9.3 Nagios Host Configuration
You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file.
First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers:
    define host{
        name                    windows-server    ; The name of this host template
        use                     generic-host      ; Inherit default values from the generic-host template
        check_period            24x7              ; By default, Windows servers are monitored round the clock
        check_interval          5                 ; Actively check the server every 5 minutes
        retry_interval          1                 ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10                ; Check each server 10 times (max)
        check_command           check-host-alive  ; Default command to check if servers are alive
        notification_period     24x7              ; Send notifications out at any time - day or night
        notification_interval   30                ; Resend notifications every 30 minutes
        notification_options    d,r               ; Only send notifications for specific host states
        contact_groups          admins            ; Notifications get sent to the admins by default
        register                0                 ; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
        }
Notice that the Windows server template definition inherits default values from the generic-host template, which is defined in the sample localhost.cfg file.
Next, define a new host for the Windows machine that references the newly created windows-server host template:
    define host{
        use          windows-server     ; Inherit default values from a template
        host_name    winserver          ; The name we're giving to this host
        alias        My Windows Server  ; A longer name associated with the host
        address      192.168.1.2        ; IP address of the host
        hostgroups   allhosts           ; Host groups this server is associated with
        }
Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on:
    define hostgroup{
        hostgroup_name   windows-servers  ; The name of the hostgroup
        alias            Windows Servers  ; Long name of the group
        members          winserver        ; Comma separated list of hosts that belong to this group
        }
The winserver host will be a member of two hostgroups: allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for the things you want to monitor. All of the service examples covered here use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
    define command{
        command_name   check_nt
        command_line   $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
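To make the role of $ARG1$ and $ARG2$ concrete: Nagios splits the check_command value of a service on "!" and hands the pieces to the command definition as $ARG1$, $ARG2$ and so on. The Python sketch below only imitates that substitution for illustration; it is not how Nagios itself is implemented. The function name, the plugin directory used for $USER1$ and the host address are assumptions.

```python
import re


def expand_check_command(check_command, command_line, macros):
    """Split a check_command on '!' and fill the $ARGn$ macros."""
    name, *args = check_command.split("!")
    values = dict(macros)
    for position, arg in enumerate(args, start=1):
        values[f"$ARG{position}$"] = arg
    for macro, value in values.items():
        command_line = command_line.replace(macro, value)
    # Any $ARGn$ that received no value is dropped in this sketch.
    command_line = re.sub(r"\$ARG\d+\$", "", command_line).strip()
    return name, command_line


COMMAND_LINE = "$USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$"
MACROS = {
    "$USER1$": "/usr/local/nagios/libexec",  # assumed plugin directory
    "$HOSTADDRESS$": "192.168.1.2",          # example host address
}

name, line = expand_check_command("check_nt!CPULOAD!-l 5,80,90", COMMAND_LINE, MACROS)
print(name)
print(line)
```

The printed command line is what would be run by hand on the Nagios server to test a check before adding the service definition.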
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon:
    define service{
        use                   generic-service
        host_name             winserver
        service_description   NSClient++ Version
        check_command         check_nt!CLIENTVERSION
        }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server:
    define service{
        use                   generic-service
        host_name             winserver
        service_description   Uptime
        check_command         check_nt!UPTIME
        }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or more:
    define service{
        use                   generic-service
        host_name             winserver
        service_description   CPU Load
        check_command         check_nt!CPULOAD!-l 5,80,90
        }
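The three numbers after -l are the averaging interval in minutes, the WARNING threshold and the CRITICAL threshold. The short Python sketch below restates the decision described above; it is an illustration of the thresholds only, not the source of the check_nt plugin, and the function name is hypothetical.

```python
def cpuload_state(spec, load_percent):
    """Map a CPU load percentage to a Nagios state.

    spec is the text after -l: "<minutes>,<warning>,<critical>".
    """
    _minutes, warning, critical = (int(part) for part in spec.split(","))
    if load_percent >= critical:
        return "CRITICAL"
    if load_percent >= warning:
        return "WARNING"
    return "OK"


for load in (42, 80, 95):
    print(load, cpuload_state("5,80,90", load))
```

The same "80 warns, 90 is critical" reading applies to the -w 80 -c 90 options used by the memory and disk checks.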
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or more:
    define service{
        use                   generic-service
        host_name             winserver
        service_description   Memory Usage
        check_command         check_nt!MEMUSE!-w 80 -c 90
        }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or more:
    define service{
        use                   generic-service
        host_name             winserver
        service_description   C:\ Drive Space
        check_command         check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped:
    define service{
        use                   generic-service
        host_name             winserver
        service_description   W3SVC
        check_command         check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running:
    define service{
        use                   generic-service
        host_name             winserver
        service_description   Explorer
        check_command         check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
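The seven service definitions in sections 9.5 to 9.11 differ only in service_description and check_command. When several Windows servers have to be added, the blocks can be generated instead of typed. The following Python sketch shows one possible way; the helper name and the layout of its output are assumptions, and the generated text should still be verified by Nagios before it is used.

```python
def render_service(host_name, description, check_command, template="generic-service"):
    """Return one 'define service' block as text."""
    rows = [
        ("use", template),
        ("host_name", host_name),
        ("service_description", description),
        ("check_command", check_command),
    ]
    body = "\n".join(f"    {key:<24}{value}" for key, value in rows)
    return "define service{\n" + body + "\n    }\n"


# The checks described in sections 9.5 to 9.11.
CHECKS = [
    ("NSClient++ Version", "check_nt!CLIENTVERSION"),
    ("Uptime", "check_nt!UPTIME"),
    ("CPU Load", "check_nt!CPULOAD!-l 5,80,90"),
    ("Memory Usage", "check_nt!MEMUSE!-w 80 -c 90"),
    ("C:\\ Drive Space", "check_nt!USEDDISKSPACE!-l c -w 80 -c 90"),
    ("W3SVC", "check_nt!SERVICESTATE!-d SHOWALL -l W3SVC"),
    ("Explorer", "check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe"),
]

print("\n".join(render_service("winserver", d, c) for d, c in CHECKS))
```

Calling render_service with another host_name produces the same set of checks for a second server.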
10 Statusmap
10.1 How to Have a Smooth Map
The Statusmap is the human-visible status of the CIST network.
To have a smooth map, we need "smooth icons".
In our case we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
The Nagios icons come in several formats, including GIF and GD2. The best thing to do is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with the host kohkong:
    host_name         kohkong
    alias             localhost
    address           192.168.2.5
    parents           Cist SW001
    icon_image        suse.png
    vrml_image        suse.png
    statusmap_image   suse.png
In this case we use suse.png.
10.2.1 Icon image
This image is used in the normal Nagios menu.
10.2.2 Vrml_image
This image is used in the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona/ ("Cortona VRML client").
10.2.3 Statusmap_image
This image is used in the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html; the best fit is GND format in 40x40 pcx. You can also convert any icons you find on the Internet to this format. Here is an online tool to do this: http://www.easypict.org
Where to put the icons
The icons have to be copied with your favorite SSH explorer into /usr/local/nagios/share/images
11 Cist Monitored hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them…
[Figure: map of the CIST network. Subnets shown: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23 and 192.168.1.0/30. Zones shown: Common Servers; Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Internet Access; Internet. Switches shown: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006.]
Hosts shown on the map:
- KAMPOT & KEP - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1: Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG - HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5: SMTP, POP, Antispam, Mail Antivirus
- KANDAL - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5: Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO - HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1: Proxy, Firewall
- MODEM - 512 Mbs, Fixed Public IP Address: ADSL Gateway
- PURSAT - PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1: Supervision
- PAILIN - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1: Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1: Data backup (Kohkong, Kandal, Pailin), Ghost server
IT Department Page 33
28-Aug-12 CIST IT - InstampConf - Nagios Project - V01 DRAFT - 2007615doc
notification_period 24x7 Send notification out at any time
- day or night
notification_interval 30 Resend notifications every 30
minutes
notification_options dr Only send notifications for
specific host states
contact_groups admins Notifications get sent to the
admins by default
register 0 DONT REGISTER THIS - ITS
JUST A TEMPLATE
Notice that the Windows server template definition is inheriting default values from the generic-host template which is defined in the sample localhostcfg file
Next define a new host for the Windows machine that references the newly created windows-server host template
define host
use windows-server Inherit default values from a template
host_name winserver The name were giving to this
host
alias My Windows Server A longer name associated with the
host
address 19216812 IP address of the host
hostgroups allhosts Host groups this server is
associated with
Add an optional hostgroup for Windows servers This is useful if you create additional servers in the future
and want to view them together in the CGIs It can also be useful for object definition tricks that you can use to manage larger configurations later on define hostgroup
hostgroup_name windows-servers The name of the hostgroup
alias Windows Servers Long name of the group
members winserver Comma separated list of hosts
that belong to this group
The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhostcfg) and windows-servers (which is defined above)
9.4 Monitoring Services
Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution, and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:
define command{
    command_name    check_nt
    command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
    }
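To make the role of $ARG1$ and $ARG2$ more concrete, here is a small Python sketch of how a check_command value such as check_nt!CPULOAD!-l 5,80,90 could be turned into a command line: the parts after each "!" fill $ARG1$, $ARG2$ and so on. This only imitates the substitution and is not how Nagios is implemented. The expand_check_command helper is a hypothetical name, and the value used for $USER1$ (the plugin directory) is an assumption based on a typical source installation.

```python
import re


def expand_check_command(check_command, command_line, macros):
    """Build the command line that a check_command value would produce."""
    # "check_nt!CPULOAD!-l 5,80,90" -> command name, then $ARG1$, $ARG2$, ...
    _name, *args = check_command.split("!")
    values = dict(macros)
    for position, arg in enumerate(args, start=1):
        values[f"$ARG{position}$"] = arg
    # Macros without a value (for example an unused $ARG2$) become empty strings.
    expanded = re.sub(
        r"\$[A-Z0-9_]+\$",
        lambda match: values.get(match.group(0), ""),
        command_line,
    )
    return expanded.strip()


COMMAND_LINE = "$USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$"
MACROS = {
    "$USER1$": "/usr/local/nagios/libexec",  # assumed plugin directory
    "$HOSTADDRESS$": "192.168.1.2",          # address of the winserver host
}

print(expand_check_command("check_nt!CLIENTVERSION", COMMAND_LINE, MACROS))
print(expand_check_command("check_nt!CPULOAD!-l 5,80,90", COMMAND_LINE, MACROS))
```

Running the resulting command line by hand on the Nagios server is a convenient way to test a check before adding the service definition.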
Now let's go over some example service definitions for monitoring different aspects of the Windows machine.
9.5 Monitoring NSClient++ Version
The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.
define service{
    use                     generic-service
    host_name               winserver
    service_description     NSClient++ Version
    check_command           check_nt!CLIENTVERSION
    }
9.6 Monitoring Uptime
The following service definition will allow you to monitor the uptime of the Windows server.
define service{
    use                     generic-service
    host_name               winserver
    service_description     Uptime
    check_command           check_nt!UPTIME
    }
9.7 Monitoring CPU Load
The following service definition will monitor the CPU utilization on the Windows server and generate a CRITICAL alert if the 5-minute CPU load is 90% or more, or a WARNING alert if the 5-minute load is 80% or greater.
define service{
    use                     generic-service
    host_name               winserver
    service_description     CPU Load
    check_command           check_nt!CPULOAD!-l 5,80,90
    }
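The value 5,80,90 passed with -l packs three numbers together: the averaging interval in minutes, the WARNING threshold and the CRITICAL threshold. The Python sketch below shows one way to read that triplet and apply the thresholds as described in the paragraph above. It is an illustration only; parse_cpuload_option and classify are hypothetical helper names, not part of check_nt, and the comparison rule is taken from the wording of this section.

```python
def parse_cpuload_option(option):
    """Split a '<minutes>,<warning>,<critical>' string into three integers."""
    minutes, warning, critical = (int(part) for part in option.split(","))
    return minutes, warning, critical


def classify(load_percent, warning, critical):
    """Map a CPU load percentage to a Nagios-style state name."""
    if load_percent >= critical:
        return "CRITICAL"
    if load_percent >= warning:
        return "WARNING"
    return "OK"


minutes, warning, critical = parse_cpuload_option("5,80,90")
for load in (42, 85, 95):
    print(f"{minutes}-minute load {load}% -> {classify(load, warning, critical)}")
```

The same warning-then-critical pattern applies to the -w 80 -c 90 options used for memory and disk usage.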
9.8 Monitoring Memory Usage
The following service definition will monitor memory usage on the Windows server and generate a CRITICAL alert if memory usage is 90% or more, or a WARNING alert if memory usage is 80% or greater.
define service{
    use                     generic-service
    host_name               winserver
    service_description     Memory Usage
    check_command           check_nt!MEMUSE!-w 80 -c 90
    }
9.9 Monitoring Disk Usage
The following service definition will monitor usage of the C:\ drive on the Windows server and generate a CRITICAL alert if disk usage is 90% or more, or a WARNING alert if disk usage is 80% or greater.
define service{
    use                     generic-service
    host_name               winserver
    service_description     C:\ Drive Space
    check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
    }
9.10 Monitoring A Windows Service
The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.
define service{
    use                     generic-service
    host_name               winserver
    service_description     W3SVC
    check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
    }
9.11 Monitoring A Windows Process
The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.
define service{
    use                     generic-service
    host_name               winserver
    service_description     Explorer
    check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
    }
10 Statusmap
10.1 How to have a smooth map
The Statusmap is the human-viewable status display of the CIST network.
To have a smooth map, we need "smooth icons".
In our case, we have also replaced the default Nagios theme with another one.
10.2 Adding and Changing Icons
Nagios icons exist in several formats, such as GIF and GD2. The best approach, however, is to use only PNG files, so that you have the same icon in all the different modules of Nagios. Sample with kohkong:
    host_name           kohkong
    alias               localhost
    address             192.168.2.5
    parents             Cist SW001
    icon_image          suse.png
    vrml_image          suse.png
    statusmap_image     suse.png
In this case we use suse.png.
10.2.1 Icon image
icon_image is used for the normal menus of Nagios.
10.2.2 Vrml_image
vrml_image is used for the 3D map environment, but because of our special Nagios theme we don't use it. If you do use the 3D map, Internet Explorer or Firefox needs a special plugin to display it correctly. You can find it at http://www.parallelgraphics.com/products/cortona ("Cortona VRML client").
10.2.3 Statusmap_image
statusmap_image is used for the 2D status map, the one we do use.
We use special icons for it. They can be found at http://www.nagiosexchange.org/Image_Packs.75.0.html. The best fit is the GD2 format at 40x40 pixels. You can also convert any icons you find on the internet to this format; an online tool for doing this is http://www.easypict.org.
Where to put the icons: copy the icons, with your favorite SSH file transfer client, to /usr/local/nagios/share/images.
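Before copying a PNG icon to the server, it can be handy to confirm that it really has the 40x40 size mentioned above. The following Python sketch reads the width and height from the header of a PNG file using only the standard library. The helper names png_size and is_statusmap_sized are chosen for this example, and the check covers the pixel size only, not the image format that Nagios expects.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG file bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian 32-bit integers after the chunk type.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def is_statusmap_sized(data, expected=(40, 40)):
    """True if the image has the 40x40 size suggested for status map icons."""
    return png_size(data) == expected


# Example use, assuming a local copy of the icon named suse.png:
#     with open("suse.png", "rb") as handle:
#         print(is_statusmap_sized(handle.read()))
```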
11 CIST Monitored Hosts
Below is the map of all the hosts scanned by Nagios. All the red machines are monitored by Nagios, but the list is not limited to them...
[Figure: map of the CIST network. The labels recoverable from the map are listed below.]
Subnets: 192.168.2.0/28, 192.168.3.0/26, 172.16.0.0/23, 192.168.1.0/30
Zones: Common Servers; Staff Servers, Staff PCs, Printers & WiFi (~40 PCs); Students PCs (~70 PCs) + VMWare & Virtual Company (Internet Access Only); Internet Access; Internet
Switches: CISTSW001, CISTSW002, CISTSW003, CISTSW004, CISTSW005, CISTSW006
Servers:
- KAMPOT & KEP - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 148 GB RAID 1. Role: Primary / Secondary Domain Controller, DNS, DHCP, NTP, WSUS
- KOHKONG - HP Proliant, Open SuSE 10.2, 3.2 GHz, 2 GB, 280 GB RAID 5. Role: SMTP, POP, Antispam, Mail Antivirus
- KANDAL - HP Proliant, Windows 2003 Server, 3.2 GHz, 2 GB, 280 GB RAID 5. Role: Students Files Server, Moodle, Antivirus ERO, Instant Messaging
- TAKEO - HP Desktop, SuSE LES 10.2, 3.2 GHz, 2 GB, 80 GB RAID 1. Role: Proxy, Firewall
- MODEM - 512 Mbs, Fixed Public IP Address. Role: ADSL Gateway
- PURSAT - PTC Desktop, Open SuSE 10.2, 2.6 GHz, 2 GB, 80 GB RAID 1. Role: Supervision
- PAILIN - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 320 GB RAID 1. Role: Learning Management, DataBase, Print server, Staff Files Server
- PREYVENG - PTC Desktop, Windows 2003 Server, 2.6 GHz, 2 GB, 500 GB RAID 1. Role: Data backup (Kohkong, Kandal, Pailin), Ghost server