SNMP process check

Script : check_snmp_process.pl

Last update : Jun 09 2007

Description :

Checks by SNMP (v1, v2c with the -2 option, or v3) whether a process is running and how many instances are running (minimum and maximum).
It can also check the memory and CPU used by one process or a group of processes.

Works on Windows, Linux/Unix, AS400.


Standard checks

The plugin checks if there is at least one process matching the filter (-n option) when no warning or critical levels are set.
The filter is treated as a regular expression by default, but you can deactivate this with the -r option.

The following options extend the text that the filter is matched against :

-f : use the full path of the process instead of only its name

-A : add the process parameters after its name

Option    How the script will see the process
None      named
-f        /usr/sbin/named
-A        named -u named -t /var/named/chroot
-f -A     /usr/sbin/named -u named -t /var/named/chroot

Warning : the -f and -A options will not work properly for Windows hosts (the SNMP agent doesn't provide this information).
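
The selection rules above can be sketched in a few lines of Python. This is only an illustration, not the plugin's code : the function names and the dictionary layout are hypothetical, and Python's re module stands in for the Perl regular expressions the script actually uses.

```python
import re

def process_string(name, path="", params="", full_path=False, add_params=False):
    """Build the text the filter is compared against (hypothetical helper).

    full_path mirrors -f, add_params mirrors -A.
    """
    text = path if full_path else name
    if add_params and params:
        text = f"{text} {params}"
    return text

def matches(filter_text, candidate, use_regexp=True):
    """Regular-expression search by default, exact comparison when -r is given."""
    if use_regexp:
        return re.search(filter_text, candidate) is not None
    return candidate == filter_text

# Example process, taken from the table above.
proc = {
    "name": "named",
    "path": "/usr/sbin/named",
    "params": "-u named -t /var/named/chroot",
}

seen = process_string(**proc, full_path=True, add_params=True)
print(seen)
print(matches("named.*-t /var/named/chroot", seen))
```

With -r, the comparison is exact, so a filter of "name" would no longer select a process called "named".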

You can use the -w and -c options to set the warning and critical levels :
-w <minW>,<maxW> : minW and maxW are the minimum and maximum number of processes.

-c <minC>,<maxC> : same thing for the critical levels.
The values must satisfy : minC <= minW < maxW <= maxC

You can omit <maxW> and <maxC>.

With N the current number of processes :
- N <= minC : critical
- minC < N <= minW : warning
- minW < N <= maxW : OK
- maxW < N <= maxC : warning
- maxC < N : critical
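
As a rough illustration of these ranges, the Python sketch below classifies a process count. It is not the plugin's code : parse_levels and classify are hypothetical names, and a count equal to minC is treated as critical, in line with the "-w 2 -c 0" example further down (0 process is critical).

```python
def parse_levels(text):
    """Split 'MIN[,MAX]' into (min, max); max is None when omitted."""
    parts = text.split(",")
    low = int(parts[0])
    high = int(parts[1]) if len(parts) > 1 and parts[1] != "" else None
    return low, high

def classify(n, warn, crit):
    """Return the status for n processes, given the -w and -c option values."""
    min_w, max_w = parse_levels(warn)
    min_c, max_c = parse_levels(crit)
    # Critical limits are tested first because they enclose the warning ones.
    if n <= min_c or (max_c is not None and n > max_c):
        return "CRITICAL"
    if n <= min_w or (max_w is not None and n > max_w):
        return "WARNING"
    return "OK"

for count in (0, 2, 5, 9, 16):
    print(count, classify(count, "3,8", "0,15"))
```

A minimum of -1 disables the lower limit in this sketch, since a count can never be <= -1.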

Memory checks

The -m option checks the memory used by the selected processes.
By default, the check is made on the process which uses the most memory. The -a switch uses the average over the selected processes instead.

Ex : -m 7,20 will send a warning if a process uses more than 7 Mb, and a critical if it uses more than 20 Mb.
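
The difference between the default (maximum) and the -a (average) behaviour can be sketched as follows. This is an illustration only : memory_status is a hypothetical function and the per-process values are made up for the example.

```python
def memory_status(mem_mb, warn_mb, crit_mb, average=False):
    """Compare the max (default) or the average (-a) memory use to the levels.

    mem_mb is a non-empty list with the memory used by each selected process, in Mb.
    """
    value = sum(mem_mb) / len(mem_mb) if average else max(mem_mb)
    if value > crit_mb:
        return "CRITICAL", value
    if value > warn_mb:
        return "WARNING", value
    return "OK", value

usage = [2, 4, 12]  # hypothetical memory use of three processes, in Mb
print(memory_status(usage, 7, 20))                # check on the biggest process
print(memory_status(usage, 7, 20, average=True))  # check on the average
```

With -m 7,20, one process at 12 Mb is enough to raise a warning by default, while the average (6 Mb) stays under the warning level.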

CPU checks

When you use the -u option, a temporary file will be created in "/tmp" by default : this can be changed at the beginning of the script.
The file name will be : tmp_Nagios_proc.<host IP>.<process filter>.

The -u option adds up the CPU used by all selected processes and then makes the check.

-u 91,95 : will send a warning if more than 91% of cpu is used, and critical if more than 95% is used.

On multiprocessor hosts, the % of cpu use can be > 100% : on a 4 CPU host, cpu usage can go up to 400% (the script doesn't check if a host is multiprocessor or not).

The script currently wants a minimum of 5 minutes between values taken from the host (this can be changed at the beginning of the script, or with the -d option). You can check more often than once every 5 minutes, but don't set check-interval to more than 15 minutes.
When the script doesn't have enough data to compute the CPU use (for example, the first time it is run), it returns an UNKNOWN status.
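
The sketch below shows one way a CPU percentage can be derived from two samples. It rests on assumptions that are not stated on this page : the counter is taken to be cumulative CPU time in centiseconds (as hrSWRunPerfCPU is defined in HOST-RESOURCES-MIB), only two samples are compared, and the names cpu_percent and cpu_status are hypothetical. The real script stores its samples in the temporary file described above and may compute the value differently.

```python
def cpu_percent(previous, current, min_delta=300):
    """Return the CPU use in % of one CPU between two samples, or None.

    Each sample is (timestamp_in_seconds, summed_cpu_centiseconds) for all
    selected processes. None means the samples are too close together.
    """
    elapsed = current[0] - previous[0]
    if elapsed < min_delta:
        return None
    used_seconds = (current[1] - previous[1]) / 100.0  # centiseconds -> seconds
    return 100.0 * used_seconds / elapsed

def cpu_status(percent, warn, crit):
    """Map a percentage to a status; no data gives UNKNOWN."""
    if percent is None:
        return "UNKNOWN"
    if percent > crit:
        return "CRITICAL"
    if percent > warn:
        return "WARNING"
    return "OK"

# 150 s of CPU time consumed over 300 s of wall-clock time.
print(cpu_percent((0, 0), (300, 15000)))
# Four fully busy CPUs over the same period give more than 100%.
print(cpu_percent((0, 0), (300, 120000)))
```

The second call illustrates why the value can exceed 100% on a multiprocessor host : 1200 s of CPU time over 300 s is 400%.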

Message size option (-o option)

If you get an "ERROR: running table : Message size exceeded maxMsgSize" error, you may need to raise maxMsgSize, i.e. the maximum size of an SNMP message, with the -o option. Try a value with both the -o and the -v options : the script will output the actual value, so you can add some octets to it with the -o option.

SNMP Login

See snmp info page

Requirements :

- Perl in /usr/bin/perl - or just run 'perl script'
- Net::SNMP
- file 'utils.pm' in plugin directory

Download latest version : 1.5

Configuration examples

Changelog : On CVS repository on sourceforge : http://nagios-snmp.cvs.sourceforge.net/nagios-snmp/plugins/.

Examples :

All examples below assume the script is in the current directory and the host to be checked uses the SNMP community "public".


Get help

./check_snmp_process.pl -h

snmpv3 login

./check_snmp_process.pl -H <host> -l login -x passwd

Check if at least one process matching http is running

./check_snmp_process.pl -H <host> -C public -n http

Result example :

3 process matching http : > 0 : OK

Check if at least 3 processes matching http are running

./check_snmp_process.pl -H <host> -C public -n http -w 2 -c 0

Result example :
(<=2 will return warning, 0 critical)

3 process matching httpd : > 2 : OK

Check if at least one process named "httpd" exists (no regexp)

./check_snmp_process.pl -H <host> -C public -n httpd -r

Result example :

3 process named httpd : > 0 : OK

Check processes by their full path : check processes of /opt/soft/bin/ (at least one)

./check_snmp_process.pl -H <host> -C public -n /opt/soft/bin/ -f

Check that at least 3 processes but not more than 8 are running

./check_snmp_process.pl -H <host> -C public -n http -w 3,8 -c 0,15

Same checks + check of the maximum memory used by a process (in Mb) : warning and critical levels

./check_snmp_process.pl -H <host> -C public -n http -w 3,8 -c 0,15 -m 9,25

Same checks + check of the sum of the CPU used by all selected processes

./check_snmp_process.pl -H <host> -C public -n http -w 3,8 -c 0,15 -m 9,25 -u 70,99

Output of check_snmp_process.pl -h

SNMP Process Monitor for Nagios version 1.5
GPL licence, (c)2004-2006 Patrick Proy

Usage: ./check_snmp_process.pl [-v] -H <host> -C <snmp_community> [-2] | (-l login -x passwd) [-p <port>] -n <name> [-w <min_proc>[,<max_proc>] -c <min_proc>[,max_proc] ] [-m<warn Mb>,<crit Mb> -a -u<warn %>,<crit%> -d<delta> ] [-t <timeout>] [-o <octet_length>] [-f -A -F ] [-r] [-V] [-g]
-v, --verbose
   print extra debugging information (and lists all storages)
-h, --help
   print this help message
-H, --hostname=HOST
   name or IP address of host to check
-C, --community=COMMUNITY NAME
   community name for the host's SNMP agent (implies SNMP v1 or v2c with option)
-l, --login=LOGIN ; -x, --passwd=PASSWD, -2, --v2c
   Login and auth password for snmpv3 authentication
   If no priv password exists, implies AuthNoPriv
   -2 : use snmp v2c
-X, --privpass=PASSWD
   Priv password for snmpv3 (AuthPriv protocol)
-L, --protocols=<authproto>,<privproto>
   <authproto> : Authentication protocol (md5|sha : default md5)
   <privproto> : Priv protocole (des|aes : default des)
-p, --port=PORT
   SNMP port (Default 161)
-n, --name=NAME
   Name of the process (regexp)
   No trailing slash !
-r, --noregexp
   Do not use regexp to match NAME in description OID
-f, --fullpath
   Use full path name instead of process name
   (Windows doesn't provide full path name)
-A, --param
   Add parameters to select processes.
   ex : "named.*-t /var/named/chroot" will only select named process with this parameter
-F, --perfout
   Add performance output
   outputs : memory_usage, num_process, cpu_usage
-w, --warn=MIN[,MAX]
   Number of process that will cause a warning
   -1 for no warning, MAX must be >0. Ex : -w-1,50
-c, --critical=MIN[,MAX]
   number of process that will cause an error (
   -1 for no critical, MAX must be >0. Ex : -c-1,50
Notes on warning and critical :
   with the following options : -w m1,x1 -c m2,x2
   you must have : m2 <= m1 < x1 <= x2
   you can omit x1 or x2 or both
-m, --memory=WARN,CRIT
   checks memory usage (default max of all process)
   values are warning and critical values in Mb
-a, --average
   makes an average of memory used by process instead of max
-u, --cpu=WARN,CRIT
   checks cpu usage of all process
   values are warning and critical values in % of CPU usage
   if more than one CPU, value can be > 100% : 100%=1 CPU
-d, --delta=seconds
   make an average of <delta> seconds for CPU (default 300=5min)
-g, --getall
   In some cases, it is necessary to get all data at once because
   process die very frequently.
   This option eats bandwidth an cpu (for remote host) at breakfast.
-o, --octetlength=INTEGER
   max-size of the SNMP message, usefull in case of Too Long responses.
   Be carefull with network filters. Range 484 - 65535, default are
   usually 1472,1452,1460 or 1440.
-t, --timeout=INTEGER
   timeout for SNMP in seconds (Default: 5)
-V, --version
   prints version number
Note :
   CPU usage is in % of one cpu, so maximum can be 100% * number of CPU
   example :
   Browse process list : <script> -C <community> -H <host> -n <anything> -v
   the -n option allows regexp in perl format :
   All process of /opt/soft/bin : -n /opt/soft/bin/ -f
   All 'named' process : -n named

This project is hosted on SourceForge.net.

Nagios and the Nagios logo are registered trademarks of Ethan Galstad.