Wednesday, March 12, 2014

The Great Nagios Adventure

Do you know the answers to the following questions for the servers you support? Is my server up or down? Is the hard drive filling up? What is the CPU load? Is the website actually displaying? If you don't know the answers to these questions, you should. How do you find out, you ask? The answer is Nagios. Nagios is an open source web app running on Linux that can monitor multiple servers in your network. I have been using Nagios for several years, but recently had to actually install and configure it for the first time. The installation is very easy, but depending on your environment, the setup and configuration can be tricky.


You can get the Nagios files here. If you are a DIY guy like me, then you want to download the Nagios Core version. This is the free, open source version. There are paid, supported versions as well.

So, this is how it goes, step by step.

1.      Get the Prerequisites. You need the following packages already installed.
a.       Apache
b.      PHP
c.       GCC Compiler
d.      GD Development Libraries
Yum can be used to install each one as seen below.
yum install httpd php
yum install gcc glibc glibc-common
yum install gd gd-devel
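
If you want to double check that the prerequisites really ended up on the PATH before going further, a small sketch like the one below can help. It is not part of the official procedure: missing_commands is a hypothetical helper name, and the binary names in the example (httpd, php, gcc) are my assumption about what the packages above provide on a Red Hat style system.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example (assumed binary names): an empty result means all three were found.
# missing_commands httpd php gcc
```

Anything the helper prints still needs to be installed with yum.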

2.      Account Setup
a.       First make sure you are using root permissions.
                                                              i.      su -
b.      Create the Nagios Account and give it a password.
                                                              i.      /usr/sbin/useradd -m nagios
                                                            ii.      passwd nagios

c.       Create a new nagcmd group to allow external commands to be submitted through the web UI. Add both the nagios user and the apache user to this group.

/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd apache
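
To confirm the group membership took effect, you can compare the group name against the output of id -nG. The sketch below is only an illustration; has_group is a hypothetical helper name, and it assumes group names contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: succeed when group $1 appears in the
# space-separated group list $2 (the format printed by `id -nG`).
has_group() {
    for g in $2; do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

# Example: both lines should print "ok" after the usermod commands above.
# has_group nagcmd "$(id -nG nagios)" && echo ok
# has_group nagcmd "$(id -nG apache)" && echo ok
```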

3.      Get the Nagios Core Install and Plugins
a.       Create and navigate to a download directory
                                                              i.      mkdir ~/downloads
b.      Navigate to ~/downloads
                                                              i.      cd ~/downloads
4.      Download the Nagios Core and Nagios Plugins tarballs into that directory

5.      Install Nagios
a.       First uncompress the downloaded files
                        i.   tar xzf nagios-4.0.4.tar.gz
                      ii.   tar xzf nagios-plugins-2.0.tar.gz

b.      Navigate to the Nagios core file folder
                        i.   cd nagios-4.0.4
c.       Run the Nagios Configure Script including the New Group Name
                                                              i.      ./configure --with-command-group=nagcmd
d.      Compile Nagios Core
                                                              i.      make all
e.       Install it all
                                                              i.      make install
                                                            ii.      make install-init
                                                          iii.      make install-config
                                                          iv.      make install-commandmode
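
If you would rather run the build in one go, the targets above can be wrapped in a loop that stops at the first failure. This is a rough sketch under my own assumptions: build_nagios_core is a hypothetical helper name, it must be run from inside the extracted Nagios source directory after ./configure, and the MAKE override exists only so the sequence can be previewed.

```shell
#!/bin/sh
# MAKE can be overridden (e.g. MAKE=echo) to preview the targets without building.
MAKE=${MAKE:-make}

# Hypothetical helper: run the Nagios Core build targets in order,
# stopping at the first one that fails.
build_nagios_core() {
    for target in all install install-init install-config install-commandmode; do
        "$MAKE" "$target" || {
            echo "make $target failed; stopping" >&2
            return 1
        }
    done
}

# Example: run from inside the nagios-4.0.4 directory.
# build_nagios_core
```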


Don't start Nagios yet - there's still more that needs to be done... 



6) Customize Configuration

Sample configuration files have now been installed in the /usr/local/nagios/etc directory. These sample files should work fine for getting started with Nagios. You'll need to make just one change before you proceed...

Edit the /usr/local/nagios/etc/objects/contacts.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts.

 
vi /usr/local/nagios/etc/objects/contacts.cfg
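
If you prefer not to open an editor, the same change can be scripted. The sketch below is an assumption-laden illustration, not the official method: set_contact_email is a hypothetical helper name, it assumes the contact definition has a line whose first word is "email" (as in the sample file), and it does not handle addresses containing "|" or "&".

```shell
#!/bin/sh
# Hypothetical helper: replace the address on every "email" line of a
# contacts file ($1) with the address in $2. Assumes the address contains
# no "|" or "&" characters, which would confuse the sed expression.
set_contact_email() {
    file=$1
    addr=$2
    tmp=$(mktemp) || return 1
    if sed "s|^\([[:space:]]*email[[:space:]][[:space:]]*\)[^[:space:];]*|\1$addr|" "$file" >"$tmp"; then
        cat "$tmp" >"$file"   # overwrite in place, keeping the file's permissions
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example (admin@example.com is a placeholder address):
# set_contact_email /usr/local/nagios/etc/objects/contacts.cfg admin@example.com
```

Any trailing comment on the email line is left untouched, so it is worth looking at the file afterwards to confirm the result.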
 
7) Configure the Web Interface

Install the Nagios web config file in the Apache conf.d directory.
 
make install-webconf
 
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later.
 
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
 
Restart Apache to make the new settings take effect.
 
service httpd restart
 
Note: Consider implementing the enhanced CGI security measures described here to ensure that your web authentication credentials are not compromised.

8) Compile and Install the Nagios Plugins

Extract the Nagios plugins source code tarball.
 
cd ~/downloads
 
tar xzf nagios-plugins-2.0.tar.gz
 
cd nagios-plugins-2.0
 
Compile and install the plugins.
 
./configure --with-nagios-user=nagios --with-nagios-group=nagios
 
make
 
make install
 
9) Start Nagios

Add Nagios to the list of system services and have it automatically start when the system boots.
 
chkconfig --add nagios
 
chkconfig nagios on
 
Verify the sample Nagios configuration files.
 
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
 
If there are no errors, start Nagios.
 
service nagios start
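
The verify and start steps can be chained so that Nagios is never started with a broken configuration. This is a minimal sketch using the paths from this post; start_nagios_if_valid is a hypothetical helper name, and the NAGIOS_BIN and NAGIOS_CFG overrides are my own addition for installs in a different location.

```shell
#!/bin/sh
# Paths as used in this post; override them if your install differs.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Hypothetical helper: start Nagios only when the config check exits 0.
start_nagios_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        service nagios start
    else
        echo "configuration check failed; not starting Nagios" >&2
        return 1
    fi
}

# Example:
# start_nagios_if_valid
```

When the check fails, run the verify command by hand to see the full error output.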
 
10) Modify SELinux Settings

Fedora ships with SELinux (Security Enhanced Linux) installed and in Enforcing mode by default. This can result in "Internal Server Error" messages when you attempt to access the Nagios CGIs. See if SELinux is in Enforcing mode.
 
getenforce
 
Put SELinux into Permissive mode.
 
setenforce 0
 
To make this change permanent, you'll have to modify the settings in /etc/selinux/config and reboot. Instead of disabling SELinux or setting it to permissive mode, you can use the following command to run the CGIs under SELinux enforcing/targeted mode:
 
chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
 
chcon -R -t httpd_sys_content_t /usr/local/nagios/share/
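
Since getenforce only reports the running mode, it can be handy to also read back what /etc/selinux/config will apply at the next boot. A small sketch, assuming the usual SELINUX=<mode> line in that file; selinux_config_mode is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the mode set on the SELINUX= line of the
# given config file (defaults to /etc/selinux/config).
# The SELINUXTYPE= line is deliberately not matched.
selinux_config_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "${1:-/etc/selinux/config}"
}

# Example: compare the boot-time setting with the running mode.
# selinux_config_mode
# getenforce
```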
 
For information on running the Nagios CGIs under Enforcing mode with a targeted policy, visit the Nagios Support Portal or Nagios Community Wiki.

11) Login to the Web Interface

You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username (nagiosadmin) and password you specified earlier.
 
http://localhost/nagios/
 
Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.

12) Other Modifications

Make sure your machine's firewall rules are configured to allow access to the web server if you want to access the Nagios interface remotely.

Configuring email notifications is out of the scope of this documentation. While Nagios is currently configured to send you email notifications, your system may not yet have a mail program properly installed or configured. Refer to your system documentation, search the web, or look to the Nagios Support Portal or Nagios Community Wiki for specific instructions on configuring your system to send email messages to external addresses. More information on notifications can be found here.

You're Done

Congratulations! You successfully installed Nagios. Your journey into monitoring is just beginning.

Install and Configure the Nagios NRPE - Plugin - on a Remote Client

Install NRPE on Linux


NRPE stands for "Nagios Remote Plugin Executor". NRPE allows the Nagios server to remotely execute Nagios commands or plugins on other Linux/Unix machines. NRPE is also available for Windows servers.


The following shows how to install NRPE on Linux from source.

For this example I am installing it on Linux version 2.6.18-308.1.1.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)) #1 SMP Wed Mar 7 04:16:51 EST 2012
1.       Install NRPE required packages.
a.       [root@server ~]# yum install openssl-devel gcc xinetd make

2.       NRPE runs under the user "nagios", so let's add that user.
a.       [root@server ~]# useradd nagios

3.       Download and install Nagios-plugins.
a.       Create or choose a download folder for the plugin files, e.g. /downloads, then navigate to that folder and run the commands below.
b.      [root@server downloads]# wget https://www.nagios-plugins.org/download/nagios-plugins-1.5.tar.gz
c.       [root@server downloads]# tar -xvzf nagios-plugins-1.5.tar.gz
d.      [root@server downloads]# cd nagios-plugins-1.5
e.      [root@server nagios-plugins-1.5]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
f.        [root@server nagios-plugins-1.5]# make install

4.       Give the nagios user ownership of the Nagios plugin directory.
a.       [root@server nagios-plugins-1.5]# chown -R nagios:nagios /usr/local/nagios/libexec

5.       Install and configure NRPE nagios client.
a.       Choose and navigate to a download directory i.e. /downloads
b.      [root@server downloads]# wget -O nrpe-2.15.tar.gz "http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.15/nrpe-2.15.tar.gz?r=&ts=1363788540&use_mirror=hivelocity"
c.       [root@server downloads]# tar -xvzf nrpe-2.15.tar.gz
d.      [root@server downloads]# cd nrpe-2.15
e.      [root@server nrpe-2.15]# ./configure --enable-ssl --enable-command-args
f.        [root@server nrpe-2.15]# make all
g.       [root@server nrpe-2.15]# make install-plugin
h.      [root@server nrpe-2.15]# make install-daemon
i.         [root@server nrpe-2.15]# make install-daemon-config (this command may not work; if it fails, ignore it and continue)
j.        [root@server nrpe-2.15]# make install-xinetd



6.       NRPE will run under the xinetd daemon, so update its xinetd file.
a.       [root@server ~]# vi /etc/xinetd.d/nrpe

# default: on
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
        flags           = REUSE
        socket_type     = stream
        port            = 5666
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/local/nagios/bin/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = <your Nagios host server> localhost
}

7.       Add the following line for NRPE at the end of /etc/services
a.       [root@server ~]# vi /etc/services
b.      nrpe            5666/tcp                # NRPE
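
If you script this step, it is worth making it safe to run more than once so the entry is not duplicated. A minimal sketch, assuming an existing entry would start with the service name nrpe; ensure_nrpe_service is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: append the NRPE entry to a services file
# (default /etc/services) unless an nrpe entry is already present.
ensure_nrpe_service() {
    file=${1:-/etc/services}
    # Already registered? Leave the file alone.
    grep -q '^nrpe[[:space:]]' "$file" && return 0
    printf 'nrpe            5666/tcp                # NRPE\n' >>"$file"
}

# Example (as root):
# ensure_nrpe_service
```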

8.       Determine Drives to monitor
a.       [root@server ~]# df -h
Example output
[root@clientserver ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      2.0G  1.4G  458M  76% /
/dev/mapper/VolGroup01-LogVol00
                      1.4T  713G  554G  57% /u1
/dev/mapper/VolGroup00-LogVol02
                      2.0G   68M  1.8G   4% /tmp
/dev/mapper/VolGroup00-LogVol01
                      3.9G  402M  3.3G  11% /var
/dev/mapper/VolGroup00-LogVol03
                      4.9G  2.3G  2.4G  49% /usr
/dev/sda1              99M   78M   16M  84% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm

b.      From the example we will want to monitor the following Volumes
                                                               i.      /dev/mapper/VolGroup00-LogVol00 – this is the main OS drive
                                                              ii.      /dev/mapper/VolGroup01-LogVol00 – this is the U1 partition or drive
c.        Using the drive information above you need to modify and add lines to the /usr/local/nagios/etc/nrpe.cfg
d.        Go to the section under hardcoded command arguments. Modify or add the following
                                                               i.      command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/mapper/VolGroup00-LogVol00
                                                              ii.      command[check_hda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/mapper/VolGroup01-LogVol00

e.      There is no need to modify the other hardcoded arguments – these other arguments control CPU load and other checks.
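
The nrpe.cfg lines above can also be generated from df instead of typed by hand. Note that df -h wraps long device names onto two lines (as in the example output), while df -P prints one line per filesystem, which is easier to parse. The sketch below is only an illustration: nrpe_disk_commands is a hypothetical helper name, the check_hdaN names are numbered in the order df lists the filesystems, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print one NRPE
# check_disk command for each filesystem mounted at one of the mount
# points given as arguments. Thresholds match the ones used in this post.
nrpe_disk_commands() {
    awk -v mounts="$*" '
        BEGIN { n = split(mounts, want, " ") }
        NR > 1 {
            for (i = 1; i <= n; i++)
                if ($6 == want[i])
                    printf "command[check_hda%d]=/usr/local/nagios/libexec/check_disk -w 20%% -c 10%% -p %s\n", ++count, $1
        }'
}

# Example: print the lines for / and /u1, then paste them into nrpe.cfg.
# df -P | nrpe_disk_commands / /u1
```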

9.       Start/Restart xinetd service.
a.       After any change to the nrpe.cfg file, restart the xinetd service.
                                                               i.      [root@server ~]# service xinetd restart
                                                             ii.      [root@server ~]# chkconfig xinetd on


Updating the Nagios host server to monitor the Remote Client Servers


1.       Add a command definition to the commands.cfg file for the NRPE Plugin.
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
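
To see what this definition actually runs, it helps to expand the macros by hand: $USER1$ is the plugin directory (the sample resource.cfg sets it to /usr/local/nagios/libexec), $HOSTADDRESS$ is the address of the host being checked, and $ARG1$ is whatever follows the "!" in a service's check_command. The sketch below just prints that expanded command line; expand_check_nrpe is a hypothetical helper name and 192.0.2.10 is a placeholder address.

```shell
#!/bin/sh
# Hypothetical helper: print the command line Nagios would build from the
# check_nrpe definition for host address $1 and check_command value $2
# (for example "check_nrpe!check_hda1").
expand_check_nrpe() {
    user1=/usr/local/nagios/libexec    # value of $USER1$ in the sample resource.cfg
    hostaddress=$1                     # stands in for $HOSTADDRESS$
    arg1=${2#'check_nrpe!'}            # text after the "!" becomes $ARG1$
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$hostaddress" "$arg1"
}

# Example:
# expand_check_nrpe 192.0.2.10 'check_nrpe!check_hda1'
```

Running the printed command by hand on the Nagios server is a quick way to test a client before adding service definitions for it.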

2.       Add service definitions to the services.cfg file to monitor the remote host's drives and CPU load.
 Example NRPE definitions (keep in mind that you can add multiple server hostnames to each definition)

define service{
     use                                 generic-service
     host_name                      client1,client2
     service_description           OS Drive
     check_command             check_nrpe!check_hda1
     }

  define service{
     use                                 generic-service
     host_name                      client1,client2
     service_description          U1 Drive
     check_command             check_nrpe!check_hda2
     }

define service{
     use                                 generic-service
     host_name                      client1,client2
     service_description          U3 Drive
     check_command             check_nrpe!check_hda3
     }

define service{
     use                                 generic-service
     host_name                      client1,client2
     service_description           U4 Drive
     check_command             check_nrpe!check_hda4
     }

define service{
     use                                 generic-service
     host_name                      client1,client2
     service_description          CPU Load
     check_command             check_nrpe!check_load
     }

define service{
     use                                 generic-service
     host_name                      client1,client2
     service_description          Total Processes
     check_command             check_nrpe!check_total_procs
     }

define service{
     use                                 generic-service
     host_name                      client1,client2
     service_description           Current Users
     check_command             check_nrpe!check_users
}
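
Because these definitions differ only in the description and the NRPE command name, they can be generated instead of copied and edited by hand, which avoids slips like a missing brace. A rough sketch, not something Nagios provides: nrpe_service_definition is a hypothetical helper name, and the NRPE command name you pass must already be defined in nrpe.cfg on each client.

```shell
#!/bin/sh
# Hypothetical helper: print one service definition.
#   $1 = comma-separated host names
#   $2 = service description
#   $3 = NRPE command name defined in the client's nrpe.cfg
nrpe_service_definition() {
    printf 'define service{\n'
    printf '     use                  generic-service\n'
    printf '     host_name            %s\n' "$1"
    printf '     service_description  %s\n' "$2"
    printf '     check_command        check_nrpe!%s\n' "$3"
    printf '     }\n'
}

# Example: print a definition to review before adding it to services.cfg.
# nrpe_service_definition client1,client2 "CPU Load" check_load
```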

3.       Test the new configuration.
a.      /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
b.     If there are no errors you can reload the nagios service in the next step.

4.       Reload the Nagios service
a.       service nagios reload

5.       Check the web interface of nagios for the new services and run manual checks for each new service.
a.      If errors occur troubleshoot, correct, and retry.