Network Monitoring with Nagios - EDUCAUSE Homepage |

Network Monitoring with Nagios

Matt Gracie, Information Security Administrator

Canisius College, Buffalo, NY

Canisius College is one of 28 Jesuit colleges in the nation and the premier private college in Western New York. Canisius prepares leaders – intelligent, caring, faithful individuals – able to pursue and promote excellence in their professions, communities and service to humanity.

What is Nagios?

“Nagios is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do.”

-- from www.nagios.org

What Is Required?

Nagios is an open source tool released under the terms of the GNU General Public License (GPL). There is no cost to use the software, unless you choose to pay for professional support.

Nagios was originally written to run under Linux, but it should work under almost any Unix variant with a C compiler. Additionally, the machine should have an HTTP server and a TCP stack available.

How does it work?

“The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.”

-- from www.nagios.org

Why Did We Start Using It?

Sometimes, nobody knows who is responsible for a particular computer. Generally, this turns out to be a problem.

Our Setup

We are running Nagios 1.4 on Debian GNU/Linux. It is being used to constantly monitor 584 services across 345 different hosts, ranging from switches and routers to various types of servers. Though our installation is a little behind the current 2.X series, there are very few differences in functionality, so the content of this presentation should be applicable to an installation of a newer version as well.

The initial test implementation ran on a 1.4 GHz Pentium 4 desktop computer. Currently, we run it on a 2.6 GHz Opteron blade.

Configuration Files

The majority of the configuration in Nagios is handled via text files in the /etc/nagios directory of the server. While there are many additional files and options, the truly necessary ones for a basic implementation are:

● hosts.cfg

● hostgroups.cfg

● contacts.cfg

● contactgroups.cfg

● services.cfg
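As a rough illustration of how these files relate to one another, here is a minimal sketch assuming Nagios 1.x object syntax. The host name, address, contact name, and email address are placeholders invented for this example. The check-host-alive and notification commands, and the 24x7 time period, are assumed to be defined elsewhere, as they are in the sample configuration that ships with Nagios.

```cfg
# contacts.cfg: a person who can be notified (hypothetical contact)
define contact{
    contact_name                    jdoe
    alias                           Jane Doe
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
    email                           jdoe@example.edu
}

# contactgroups.cfg: contacts are always referenced through a group
define contactgroup{
    contactgroup_name   nt-admins
    alias               NT Administrators
    members             jdoe
}

# hosts.cfg: the machine being monitored (placeholder address)
define host{
    host_name               webserver
    alias                   Web Server
    address                 192.0.2.10
    check_command           check-host-alive
    max_check_attempts      5
    notification_interval   60
    notification_period     24x7
    notification_options    d,u,r
}

# hostgroups.cfg: in 1.x the hostgroup ties hosts to their contact groups
define hostgroup{
    hostgroup_name  web-servers
    alias           Web Servers
    contact_groups  nt-admins
    members         webserver
}
```

In the 2.x series the contact_groups directive moves from the hostgroup definition to the host definition, so this part of the sketch would need adjusting on a newer installation.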

The Simplest Setup

In this configuration, the Nagios Server will periodically ping the Web Server to make sure that it is still responding to network traffic. If it stops responding, the Nagios Server will alert the people designated as contacts for the Web Server.

The Nagios Server will also use plugins, such as check_disk, to monitor its own status.
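The two checks described above might be configured roughly as follows. This is a sketch under assumptions: the thresholds are illustrative values, the host names webserver and nagios are placeholders that would need matching host definitions, and the command definitions mirror the ones found in the sample checkcommands.cfg rather than anything specific to our installation.

```cfg
# checkcommands.cfg: how the plugins are invoked
define command{
    command_name    check_ping
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}

define command{
    command_name    check_local_disk
    command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}

# services.cfg: ping the Web Server every 5 minutes
define service{
    host_name               webserver
    service_description     PING
    # warn at 100 ms round trip or 20% loss, critical at 500 ms or 60% loss
    check_command           check_ping!100.0,20%!500.0,60%
    max_check_attempts      3
    normal_check_interval   5
    retry_check_interval    1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          nt-admins
}

# services.cfg: the Nagios Server watching its own root partition
define service{
    host_name               nagios
    service_description     Root Partition
    # warn below 20% free space, critical below 10%
    check_command           check_local_disk!20%!10%!/
    max_check_attempts      3
    normal_check_interval   15
    retry_check_interval    1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          nt-admins
}
```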

In this case, the Nagios Server will use the NRPE (Nagios Remote Plugin Execution) service running on the Web Server to run plugins on the remote machine. This allows the Nagios Server to monitor private information, such as CPU load or disk space, on the Web Server, and send alerts based on configured thresholds.
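An NRPE arrangement like this one has two halves, sketched below. The addresses, plugin path, and load thresholds are assumptions made for illustration; the path in particular varies by distribution.

```cfg
# On the Web Server, in nrpe.cfg:
# only the Nagios Server (placeholder address) may connect
allowed_hosts=192.0.2.5
# name a command that the Nagios Server is allowed to request
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20

# On the Nagios Server, in checkcommands.cfg:
# check_nrpe asks the remote NRPE daemon to run the named command
define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

# On the Nagios Server, in services.cfg:
define service{
    host_name               webserver
    service_description     CPU Load
    check_command           check_nrpe!check_load
    max_check_attempts      3
    normal_check_interval   5
    retry_check_interval    1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          nt-admins
}
```

The thresholds live on the Web Server side in this sketch, so changing them means editing nrpe.cfg on the monitored machine rather than the Nagios configuration.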

In this case, the Nagios Server cannot reach the Database Server because of the intervening firewall. So, instead, the Database Server uses the send_nsca utility to submit passive reports to the Nagios Server. This allows the Nagios Server to stay abreast of the Database Server's status without requiring a specific firewall hole to accommodate it.
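On the Nagios Server, a passively monitored service could be declared along these lines. This is only a sketch: the host and service names are placeholders, and check_dummy is a stand-in command, since a service definition still needs a check_command even when active checks are turned off.

```cfg
# nagios.cfg: allow results to arrive from outside the daemon
check_external_commands=1
accept_passive_service_checks=1

# checkcommands.cfg: placeholder command, not expected to run
define command{
    command_name    check_dummy
    command_line    $USER1$/check_dummy $ARG1$
}

# services.cfg: results for this service come in through NSCA
define service{
    host_name               dbserver
    service_description     Database
    active_checks_enabled   0
    passive_checks_enabled  1
    check_command           check_dummy!3
    max_check_attempts      1
    normal_check_interval   5
    retry_check_interval    1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          nt-admins
}

# On the Database Server, send_nsca reads one result per line on standard
# input, with tab-separated fields:
#   host_name <TAB> service_description <TAB> return_code <TAB> plugin_output
# The host_name and service_description must match the definition above.
```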

More Advanced Features

Scheduled Downtime

It is possible, using the web interface for Nagios, to schedule downtime for hosts or individual services. This means that notifications for those hosts and services will be suppressed until the scheduled downtime ends; the checks themselves continue to run.

Event Handlers

An event handler is an automatic script that kicks off when a monitored service enters a particular state.

For example, if a machine is prone to having its Apache installation crash, a script could be written that attempts to restart the daemon and only mails the admin if this is unsuccessful.
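Wiring up such a handler could look something like the following sketch. The restart-httpd script is hypothetical and would have to be written separately, its path is an assumption, and check_http is assumed to be defined in the check commands file.

```cfg
# checkcommands.cfg: pass the service state to the handler script
define command{
    command_name    restart-httpd
    command_line    $USER1$/eventhandlers/restart-httpd $SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$
}

# services.cfg: attach the handler to the HTTP service
define service{
    host_name               webserver
    service_description     HTTP
    check_command           check_http
    event_handler           restart-httpd
    event_handler_enabled   1
    # four attempts leave room for the handler to act before a hard failure
    max_check_attempts      4
    normal_check_interval   5
    retry_check_interval    1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          nt-admins
}
```

Because notifications are sent only once the service reaches a hard state, a script that restarts Apache on an earlier soft-state attempt gets a chance to fix the problem first. If the restart works, the next check succeeds and nobody is mailed.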

Notification Escalations

Nagios supports Notification Escalations, a method of alerting additional or different groups of administrators when a problem goes unsolved.

define serviceescalation{
    host_name               webserver
    service_description     HTTP
    first_notification      3
    last_notification       5
    notification_interval   90
    contact_groups          nt-admins,managers
}

Initial Alert

define serviceescalation{
    host_name               webserver
    service_description     HTTP
    first_notification      6
    last_notification       10
    notification_interval   60
    contact_groups          nt-admins,managers,everyone
}

Escalated Alert

Service Dependencies

In a case like this, services on the Web Server will be unreachable from the Nagios Server if there's something wrong with the Border Router. Nagios allows for Service Dependencies to be added to the configuration, so that in a situation like that, the Web Server would not be checked until the Border Router is functioning properly again.
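A dependency of this kind might be expressed as in the sketch below. It assumes a host named borderrouter with a PING service already defined; both names are placeholders.

```cfg
define servicedependency{
    # the service that must be healthy
    host_name                       borderrouter
    service_description             PING
    # the service that depends on it
    dependent_host_name             webserver
    dependent_service_description   HTTP
    # skip checks of the dependent service while the router's PING service
    # is in a WARNING, UNKNOWN, or CRITICAL state
    execution_failure_criteria      w,u,c
    # and suppress its notifications under the same conditions
    notification_failure_criteria   w,u,c
}
```

The two criteria lines are independent, so a site could keep checking the Web Server while the router is down and suppress only the notifications by setting execution_failure_criteria to n.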

Distributed Monitoring

Alternative Frontends

Because Nagios can store its information in a standard MySQL database, as our installation does, it's very simple to draw that information out for other purposes. This is a web frontend to the Nagios database that I made for our Help Desk, so that they can see in real time if a host is down.

Third-Party Tools

There are a lot of third-party tools available for Nagios.

● Fruity

● Nagmin

● Nagat

● nagiosgraph

● And many, many more...

Summary

More Information

Nagios Home Page: http://www.nagios.org

Nagios Exchange (Third-party products): http://www.nagiosexchange.org

Complete Guide to Nagios: http://www.nagiosbook.org

Network Monitoring with Nagios: http://www.linuxjournal.com/article/6767
