Nagios custom variables for object inheritance
In our Nagios setup we're using templates and object inheritance for services and hosts.
#Le Hosts
define host{
use linux-nrpe,linux-dc3,linux-cassandra
host_name tigris
alias tigris
address 192.168.4.72
}
define host{
use linux-nrpe,linux-dc3,linux-cassandra
host_name euphrates
alias euphrates
address 192.168.4.177
}
#Le Templates
define host{
name linux-nrpe
use all-hosts
hostgroups linux-nrpe
contact_groups rhands,usergroup1,opcomms
register 0
}
#Le Services
define service{
hostgroup_name linux-nrpe
use high-priority-service,graphed-service
service_description Load
check_command check_by_nrpe!check_load!5,5,6!9,9,9
contact_groups rhands,usergroup1,opcomms
}
[...etc...]
The problem with this setup is that every server in the linux-nrpe group triggers alerts when its load hits whatever is defined in the service. Our workhorse servers might run 24/7 at a load of 20, while our DB servers sit quite happily at ~1 unless something goes wrong, so the system either sends out too many alerts or we end up ignoring (or not alerting on) things. Writing individual service definitions for each server (there are lots of them) would take ages. What we'd really like to do is something like:
define host{
name linux-nrpe
use all-hosts
hostgroups linux-nrpe
contact_groups rhands,usergroup1,opcomms
register 0
perf_load 2,2,3 5,5,6
perf_mem 95% 97%
[...more...]
}
define service{
hostgroup_name linux-nrpe
use high-priority-service,graphed-service
service_description Load
check_command check_by_nrpe!check_load!$perf_load$
contact_groups rhands,usergroup1,opcomms
}
I looked through the docs and couldn't see anything, unless I'm missing something. Any ideas?
We have a quite similar solution running in our Nagios monitoring. Custom host and service variables must start with an underscore where they are defined. To reference one, prefix the name with _HOST or _SERVICE, write the whole name in uppercase, and wrap it in $ signs like any other macro.
Therefore your perf_load and perf_mem custom variables have to be defined as
define host {
[..]
_perf_load 2,2,3 5,5,6
_perf_mem 95% 97%
[..]
}
and referenced as
define service {
[..]
check_command check_by_nrpe!check_load!$_HOSTPERF_LOAD$
[..]
}
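To get different thresholds per server, the same mechanism can be combined with inheritance: custom variables set in a template are inherited, and a host that sets the same variable overrides the inherited value. The following is a minimal sketch, not a tested config. The variable names _load_warn and _load_crit, the host workhorse01, its address and all threshold values are assumptions for illustration. It also assumes your check_by_nrpe command takes the warning and critical values as separate arguments, as in the original service definition.

```
# Template: default thresholds for every linux-nrpe host (illustrative values)
define host{
name linux-nrpe
use all-hosts
hostgroups linux-nrpe
register 0
_load_warn 5,5,6
_load_crit 9,9,9
}
# A busy host overrides the inherited defaults (hypothetical host and address)
define host{
use linux-nrpe
host_name workhorse01
address 192.0.2.10
_load_warn 25,25,25
_load_crit 30,30,30
}
# One service for the whole hostgroup; thresholds come from each host
define service{
hostgroup_name linux-nrpe
use high-priority-service,graphed-service
service_description Load
check_command check_by_nrpe!check_load!$_HOSTLOAD_WARN$!$_HOSTLOAD_CRIT$
}
```

Splitting warning and critical into two variables avoids passing a single value with an embedded space as one argument.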
A snippet from a running config of our Nagios:
define host {
host_name target
alias target
address target
use tmpl_host
_gprs_address 192.168.0.1
}
[...]
define service {
host_name target
service_description GPRS ping
use tmpl_service_ping
check_command check_fping-by-ssh!-H 1.2.3.4 -S $_HOSTGPRS_ADDRESS$ -n 7 -t 1000 -w 1000 -c 2000
event_handler check_restart-GPRS-PPP
notes_url https://wiki.
contact_groups admin_allday
}
You can find more details in the Nagios documentation.
For reference, this also works fine in Icinga.
You can also define the thresholds in the NRPE config, on the hosts themselves. This isn't practical if you have more than a few dozen hosts, unless you have some sort of configuration management (something like Puppet, or even just git/hg/svn/whatever) and use includes in nrpe.cfg.
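As a rough sketch of that approach, the per-host thresholds could live in a small file pulled in from nrpe.cfg. The file paths, the plugin location and the threshold values below are assumptions and vary by distribution; include_dir and the command[...] syntax are standard NRPE directives.

```
# /etc/nagios/nrpe.cfg (assumed path)
include_dir=/etc/nagios/nrpe.d/

# /etc/nagios/nrpe.d/load.cfg on a busy server (hypothetical file, illustrative values)
command[check_load]=/usr/lib/nagios/plugins/check_load -w 25,25,25 -c 30,30,30
```

With this layout the Nagios server only asks for check_load, and each host answers using its own thresholds.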
Lairsdragon's suggestion is much better, though. The one thing I would add is:
It can be helpful to name custom object vars with two leading underscores (__FOO), so they can be referenced as $_HOST_FOO$, with a visible separator between the HOST prefix and the variable name.
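A short sketch of the double-underscore convention. The variable name __load_warn and its value are hypothetical, and the service is trimmed to the relevant directives:

```
define host{
use linux-nrpe
host_name tigris
__load_warn 5,5,6 ; two leading underscores in the definition
}
define service{
hostgroup_name linux-nrpe
service_description Load
check_command check_by_nrpe!check_load!$_HOST_LOAD_WARN$
}
```

Only the first underscore marks the variable as custom; the second becomes part of the name, which is why it shows up after HOST in the macro.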