Throttle nagios alerts if host loses connectivity

We use nagios to monitor our server farm, and generally it works great. From time to time, though, the host where nagios runs loses connectivity for a couple of minutes, which makes nagios believe that all servers and services it monitors are down. The result is hundreds of alert mails, shortly followed by hundreds of recovery mails.

Is there any way to configure nagios in such a way that it tests its own connectivity before releasing an avalanche of alert mails?


Yes, you can define parent and child hosts. If a parent is down, nagios treats its children as unreachable rather than down, so you are not notified about each child separately (provided notifications for the unreachable state are turned off for them). You do need to set the timings properly though (in generic_service and generic_host, or whatever templates you use): when the services become unavailable, nagios must already have decided that the parent is down before it would send out notifications for those services.
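As a rough sketch of what "setting the timings properly" could look like, assuming Nagios 3 or later directive names and the default interval_length of 60 seconds (so intervals are in minutes): hosts reach a hard DOWN state after about 2 minutes, while services need about 6 minutes of failures before they notify. The numbers are illustrative only, and these are fragments to merge into your existing templates, not complete definitions.

```
# Host template: confirm a host is down quickly.
define host {
        name                 generic-host
        register             0          ; template only, not a real host
        check_interval       5          ; minutes between regular checks
        retry_interval       1          ; minutes between rechecks after a failure
        max_check_attempts   3          ; hard DOWN after 3 failed checks
}

# Service template: take longer than the hosts before notifying.
define service {
        name                 generic-service
        register             0          ; template only, not a real service
        check_interval       5
        retry_interval       2
        max_check_attempts   4          ; hard state after 4 failed checks
}
```

The point is only that the service retry window is longer than the host retry window, so the parent has been declared down by the time a service would notify.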

This is what I did:

# ISP gateway (first in traceroute)
define host {
        host_name   kpn-gateway
        alias       KPN Gateway
        address     1.2.3.4
        use         generic-host
        notification_period  never
        parents     experia
}

# gateway in datacenter
define host {
        host_name   duocast-gateway
        alias       Duocast gateway
        address     5.6.7.8
        use         generic-host
        parents     kpn-gateway
        contact_groups bla
}

# one of the hosts in datacenter.
define host {
        host_name   brick
        alias       host.example.com
        address     a.b.c.d
        use         generic-linux-host
        parents     duocast-gateway
        contact_groups geborsteldstaal
}
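
One thing to check alongside the parent chain: hosts behind a failed parent go into the UNREACHABLE state, and nagios still mails about that state if the host's notification_options includes u. A minimal sketch, assuming your host template is called generic-host as above, that limits host notifications to DOWN and recovery:

```
define host {
        name                   generic-host
        register               0        ; template only, not a real host
        notification_options   d,r      ; d = down, r = recovery; leaving out u keeps unreachable hosts quiet
}
```

Again, this is a fragment to merge into the existing template rather than a second definition of it. With this in place, only the topmost host that is really down (here the gateway) should produce a mail when connectivity is lost.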