Server Uptime Reporting and Tracking

Solution 1:

Ah, one of my favourite topics.

First, you need to define 'uptime'.

Do you mean the server is running? (in which case, just ping it regularly in a script).

Or do you mean the application is running? (connect to the application's 'home page' regularly, assuming it's a web app)

Or do you mean the application is providing the business services it is supposed to? (in which case, you need to run some sort of synthetic transaction).

I think only the last one is in any sense correct. The others are technically easier to do, but don't really correlate with "is this server providing value to the business".
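To make the difference concrete, here is a minimal Python sketch of a check that goes one step beyond "the home page answers": it also confirms the response contains some text that only appears when the application is really working. The function name `check_application`, the `expected_text` parameter and the injectable `opener` are my own choices for illustration. A real synthetic transaction would usually do more, such as logging in or submitting a test order.

```python
import urllib.request


def check_application(url, expected_text, timeout=10.0,
                      opener=urllib.request.urlopen):
    """Return True if url answers with HTTP 200 and the body contains expected_text.

    opener is injectable so the check can be exercised without a network;
    by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status == 200 and expected_text in body
    except (OSError, ValueError):
        # urllib's URLError and HTTPError are subclasses of OSError, so this
        # covers connection failures, timeouts and HTTP error statuses.
        # ValueError covers a malformed URL.
        return False
```

Run it from cron or a loop every minute or so and record the True/False result with a timestamp. The uptime figure is then the fraction of checks that passed.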

As you will see if you click on the link I added, there are many companies selling solutions that do this, or you can roll your own. I've experience with NetIQ's products, and Microsoft MOM (the two have a shared history), but I'm sure others work as well.

When you do pick a tool, consider how to account for planned upgrades and maintenance periods - a naive approach will record these as downtime.
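If you roll your own, the maintenance adjustment might look something like the sketch below. It is only an illustration under assumptions of mine: outages and maintenance windows are given as (start, end) datetime pairs, outages do not overlap each other, and maintenance windows do not overlap each other. The names `overlap` and `availability` are hypothetical. Planned maintenance is removed both from the downtime and from the time the service was supposed to be up.

```python
from datetime import datetime, timedelta


def overlap(a_start, a_end, b_start, b_end):
    """Length of time shared by two intervals (zero if they do not meet)."""
    return max(min(a_end, b_end) - max(a_start, b_start), timedelta(0))


def availability(period_start, period_end, outages, maintenance):
    """Fraction of scheduled time the service was up.

    outages and maintenance are lists of (start, end) datetime pairs.
    Assumes outages do not overlap each other, and neither do
    maintenance windows.
    """
    planned = sum(
        (overlap(s, e, period_start, period_end) for s, e in maintenance),
        timedelta(0),
    )
    scheduled = (period_end - period_start) - planned
    if scheduled <= timedelta(0):
        raise ValueError("no scheduled service time in this period")

    unplanned = timedelta(0)
    for o_start, o_end in outages:
        # Clip the outage to the reporting period.
        o_start, o_end = max(o_start, period_start), min(o_end, period_end)
        if o_end <= o_start:
            continue
        down = o_end - o_start
        # Do not count the part of the outage covered by planned maintenance.
        for m_start, m_end in maintenance:
            down -= overlap(o_start, o_end, m_start, m_end)
        unplanned += down

    return 1 - unplanned / scheduled


week_start = datetime(2024, 1, 1)
week_end = datetime(2024, 1, 8)
maintenance = [(datetime(2024, 1, 3, 2, 0), datetime(2024, 1, 3, 4, 0))]
outages = [(datetime(2024, 1, 3, 1, 30), datetime(2024, 1, 3, 4, 30))]

adjusted = availability(week_start, week_end, outages, maintenance)
naive = availability(week_start, week_end, outages, [])
print(f"adjusted: {adjusted:.4%}  naive: {naive:.4%}")
```

In this made-up week, a three hour outage that includes a two hour maintenance window counts as one hour of downtime against 166 scheduled hours, instead of three hours against 168.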

Also, 95% is very undemanding - it's equivalent to 72 minutes of downtime each day, or more than 8 hours a week. Try taking your server out of service for all of the working day each Thursday, say, and I think you'll discover your SLA is actually a bit more demanding than that ...
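The arithmetic behind those figures is easy to check for any target you are considering. A small Python sketch, where the function name `allowed_downtime_minutes` is just something I made up:

```python
def allowed_downtime_minutes(availability_percent, period_hours):
    """Minutes of downtime permitted in a period at a given availability target."""
    return (100 - availability_percent) / 100 * period_hours * 60


for target in (95, 99, 99.9):
    per_day = allowed_downtime_minutes(target, 24)
    per_week = allowed_downtime_minutes(target, 24 * 7)
    print(f"{target}%: {per_day:.1f} min/day, {per_week / 60:.2f} hours/week")
```

For 95% this gives the 72 minutes a day and 8.4 hours a week mentioned above. At 99.9% the allowance shrinks to roughly a minute and a half per day.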

Solution 2:

I use http://mon.itor.us/ (but it is down at the moment).

Solution 3:

Nagios will give you downtime reports, and is available in the standard Ubuntu repositories.