Backup Monitoring Solution [closed]

I am a Nagios user, so my method for monitoring my backups (Veeam, Windows Server Backup, Dirvish, a mysqldump script, etc.) basically amounts to using a passive service and send_nsca to report the results of a check.

For each backup technology I have written a script that is either called from the backup system's backup-finished hook or scheduled to run after completion. The scripts check exit status, logs, size and so on, and then report the status to the Nagios passive service. The service is set to alert if no status update is received within a defined time period. Exactly what you need to monitor for each backup depends on your environment and the software.
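
As a rough illustration of the kind of script described above, here is a minimal Python sketch. It assumes send_nsca is installed and that results are submitted in its default tab-separated format (host, service description, return code, plugin output). The function names, the size threshold, the config path and the host names are all hypothetical; the actual checks depend on the backup software.

```python
import subprocess

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_backup(exit_status, size_bytes, min_size_bytes):
    """Map a backup job's result to a (return_code, message) pair."""
    if exit_status != 0:
        return CRITICAL, f"backup exited with status {exit_status}"
    if size_bytes < min_size_bytes:
        # A suspiciously small backup is treated as a warning (assumed policy).
        return WARNING, f"backup is only {size_bytes} bytes"
    return OK, f"backup completed, {size_bytes} bytes"


def format_passive_result(host, service, code, message):
    """Build the tab-separated line that send_nsca reads on stdin."""
    return f"{host}\t{service}\t{code}\t{message}\n"


def submit(line, nagios_server, config="/etc/send_nsca.cfg"):
    """Pipe one result line to send_nsca (server and config path are assumptions)."""
    subprocess.run(
        ["send_nsca", "-H", nagios_server, "-c", config],
        input=line, text=True, check=True,
    )


# Example use from a backup-finished hook (hypothetical names):
#   code, msg = evaluate_backup(exit_status, size_bytes, 1_000_000)
#   submit(format_passive_result("db01", "mysqldump backup", code, msg),
#          "nagios.example.com")
```

The host and service names passed in have to match the passive service as defined on the Nagios side, otherwise the result is dropped.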

I am sure other monitoring systems have something similar to a passive service.


For me, it has always been a lazy man's approach. It works if you are the only person responsible for monitoring the backups and instant notification isn't a requirement, i.e. you don't need to fix the issue the moment it occurs.

What I do is set up each of the backup software packages to send me notifications via email.

Then I did the following:

  1. Create a rule to move these notifications to a different folder in Outlook
  2. Create another rule that searches the subject or body (depending on how the backup software formats its notifications) for words like "failed" or "incomplete" and flags the message for follow-up Today. I also have it create a copy in another folder called "Failed Backup Notifications".

Then when I check my email the next morning I can immediately see any flags or just check the "Failed Backup Notifications" folder.
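
The same keyword matching can also be done outside Outlook. The following is only a sketch in Python, not what I actually run: the pattern, the function names and the (subject, body) message format are assumptions, and the keywords need adjusting to whatever your backup software writes.

```python
import re

# Keywords from the rule described above; extend to match your software.
FAILURE_PATTERN = re.compile(r"\b(failed|incomplete)\b", re.IGNORECASE)


def is_failure(subject, body=""):
    """Return True if the subject or body mentions a failure keyword."""
    return bool(FAILURE_PATTERN.search(subject) or FAILURE_PATTERN.search(body))


def failed_notifications(messages):
    """Filter (subject, body) pairs down to those that look like failures."""
    return [m for m in messages if is_failure(*m)]
```

Plain keyword matching like this will also catch a report that says something like "0 jobs failed", so expect to tune the pattern per product.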

(One note of caution: I STILL recommend some other means of manually checking the backup programs at least once a week. I say this because if the backups aren't running AT ALL, even though the service/job scheduler seems to be running, then you won't get any notification emails. You should also have rules set up to alert you if you get an email such as "the Backup job engine service has stopped/failed" or similar.)