Nagios server is not detecting when client VMs are rebooted

Solution 1:

As others have noted in the comments, Nagios does not detect that the servers are unavailable during a reboot because the reboot completes very quickly, probably between two consecutive checks.

To check whether a server has been rebooted, you can write your own plugin. On each run, save the server's uptime in a temporary file and compare the current uptime with the previously saved value. If the current uptime is lower than the saved one, the server has rebooted since the last check and the plugin returns a critical status.
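
A minimal sketch of that idea, assuming a Linux host where /proc/uptime is available. The function name check_reboot and the state-file path are hypothetical, chosen only for illustration; the optional second argument exists so the logic can be exercised without rebooting anything.

```shell
#!/bin/bash
# Sketch only: compare the current uptime with the value saved by the previous run.
# check_reboot and the state-file location are assumptions, not part of any plugin.
check_reboot() {
  local state_file current previous
  state_file="$1"
  # Current uptime in whole seconds; $2 overrides it (useful for testing).
  current="${2:-$(cut -d. -f1 /proc/uptime)}"
  previous=""

  # Read the uptime recorded by the previous run, if there is one.
  if [ -r "$state_file" ]; then
    previous="$(cat "$state_file")"
  fi

  # Always store the current value for the next run.
  echo "$current" > "$state_file"

  case "$previous" in
    ''|*[!0-9]*)
      # No usable saved value (first run or corrupt file): nothing to compare.
      ;;
    *)
      if [ "$current" -lt "$previous" ]; then
        echo "CRITICAL: uptime dropped from ${previous}s to ${current}s, server was rebooted"
        return 2
      fi
      ;;
  esac

  echo "OK: uptime is ${current}s"
  return 0
}
```

To use it as a plugin, you would call the function at the end of the script, for example check_reboot /var/tmp/nagios_uptime.state, and exit with its return code. Note that this design reports CRITICAL only once per reboot: on the next run the saved value is already the lower one, so the check goes back to OK.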

You can also use the check-uptime plugin (https://exchange.nagios.org/directory/Plugins/System-Metrics/Uptime/check-uptime/details), which returns a critical status when the uptime is below a threshold, for example 5 minutes. That way, you receive a notification whenever the server's uptime drops below 5 minutes, which means it has been rebooted.

Use this script instead if you need to check for uptime in seconds:

#!/bin/bash
# Returns CRITICAL if the system was rebooted within the last <critical value> seconds.
CRIT_VALUE=$1
if [[ ! "$CRIT_VALUE" =~ ^[0-9]+$ ]]; then
  # Missing or non-numeric argument: print usage and exit with UNKNOWN (3),
  # since exit code 1 would be reported by Nagios as WARNING.
  echo "UNKNOWN: No argument supplied or argument is not a number."
  echo "Usage: ./uptime.sh <critical value in seconds>"
  echo "Example: ./uptime.sh 300"
  exit 3
fi

# "uptime -s" prints the boot time; convert it to epoch seconds.
boot_time=$(uptime -s)
since=$(date -d "$boot_time" +%s)
now=$(date +%s)
seconds_uptime=$(( now - since ))
if [[ "$seconds_uptime" -le "$CRIT_VALUE" ]]; then
  echo "CRITICAL! System rebooted $(( seconds_uptime / 60 )) minutes ago."
  exit 2
fi
echo "OK. Up since $(date -d "$boot_time")"
exit 0
