Nagios server does not detect when client VMs are rebooted

Solution 1:

As others have noted in the comments, Nagios does not see the servers as unavailable because they reboot very quickly: the reboot most likely finishes between two consecutive checks.

To check whether a server has been rebooted, you can write your own plugin: save the server's uptime to a temporary file on each run and compare the current uptime with the saved value. If the current uptime is lower than the saved one, the server has rebooted since the last check, and the plugin returns a critical status.
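
The state-file approach described above could look roughly like this. It is only a minimal sketch, not a finished plugin: the function name check_reboot, the STATE_FILE variable and the default path /tmp/nagios_uptime.state are made up for illustration, and reading /proc/uptime assumes a Linux host.

```shell
#!/bin/bash
# Sketch of a reboot-detection plugin: compares the current uptime with the
# value saved by the previous run. Names and paths here are hypothetical.

# check_reboot <current_uptime_seconds> <state_file>
# Prints a Nagios-style status line and returns 0 (OK) or 2 (CRITICAL).
check_reboot() {
  local current=$1 state_file=$2 previous=""
  # Read the uptime recorded by the previous run, if there was one.
  if [[ -r "$state_file" ]]; then
    read -r previous < "$state_file"
  fi
  # Save the current value for the next run.
  echo "$current" > "$state_file"
  # A lower uptime than last time means the counter was reset by a reboot.
  if [[ "$previous" =~ ^[0-9]+$ ]] && (( current < previous )); then
    echo "CRITICAL: uptime dropped from ${previous}s to ${current}s, server was rebooted"
    return 2
  fi
  echo "OK: uptime is ${current}s"
  return 0
}

# The first field of /proc/uptime is the uptime in seconds (Linux only).
read -r up _ < /proc/uptime
# The last command's return code becomes the plugin's exit status.
check_reboot "${up%.*}" "${STATE_FILE:-/tmp/nagios_uptime.state}"
```

Because the state file is overwritten on every run, the CRITICAL result shows up for a single check only; the next run compares against the new, lower value and returns OK again.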

You can also use the check-uptime plugin (https://exchange.nagios.org/directory/Plugins/System-Metrics/Uptime/check-uptime/details), which returns a critical status when the uptime is below a threshold, for example 5 minutes. That way, you receive a notification whenever the server's uptime drops below 5 minutes, which means it has been rebooted. The threshold should be longer than the check interval; otherwise a check may not run while the uptime is still below it.

Use this script instead if you need to specify the uptime threshold in seconds:

#!/bin/bash
# Nagios plugin: CRITICAL if the system booted within the last N seconds.
CRIT_VALUE=$1
if [[ -z "$CRIT_VALUE" ]]; then
  # No threshold given: print usage and exit 3 (UNKNOWN in Nagios;
  # exit code 1 would be reported as WARNING).
  echo "No argument supplied or argument missing."
  echo "Usage: ./uptime.sh <critical value in seconds>"
  echo "Example: ./uptime.sh 300"
  exit 3
fi

# uptime -s prints the boot time; convert it to epoch seconds.
boot_time=$(uptime -s)
since=$(date -d "$boot_time" +%s)
now=$(date +%s)
seconds_uptime=$(( now - since ))
if [[ "$seconds_uptime" -le "$CRIT_VALUE" ]]; then
  echo "CRITICAL! System rebooted $(( seconds_uptime / 60 )) minutes ago."
  exit 2
fi
echo "OK. Up since $(date -d "$boot_time")"
exit 0