What is the strategy for detecting time drift in a Linux-based data centre?

This is easy to control. Configuration management is the key.

Ensure that the NTP service (ntpd) is running and correctly configured on every server.

For example, an easy approach is to use Monit to make sure ntpd is running and to restart it if it fails. It may make sense to add cron and other essential daemons to the same kind of check.

Another option is to use a configuration management tool like Puppet to push the same ntp.conf to all your servers and ensure that ntpd is installed, configured and running.

NTP has enough built-in redundancy to cope with a time server being unreachable, as long as you specify multiple sources.
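
To illustrate why multiple sources help, here is a rough Python sketch. It is not how ntpd itself picks sources (ntpd has its own selection and clustering logic); it only shows that one unreachable server need not break a drift check. The function name `combined_offset` and the `query` callback are hypothetical, made up for this example.

```python
import statistics
from typing import Callable, Iterable, Optional


def combined_offset(
    servers: Iterable[str],
    query: Callable[[str], float],
) -> Optional[float]:
    """Return the median clock offset (seconds) across reachable servers.

    `query` is a hypothetical callback that returns the offset reported by
    one server and raises OSError (which includes socket timeouts) when the
    server cannot be reached.
    """
    offsets = []
    for server in servers:
        try:
            offsets.append(query(server))
        except OSError:
            # Unreachable or timed-out source: skip it and rely on the others.
            continue
    # With no reachable source there is nothing to report.
    return statistics.median(offsets) if offsets else None
```

With three or more sources, a single dead or wildly wrong server does not move the median much, which is the practical reason for listing several.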


There are a variety of check_ntp plugins for Nagios out there.

Here's one:

http://nagiosplugins.org/man/check_ntp

Add this check to your Nagios host and you will get alerts if anything goes awry.
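
If you are curious what a check like this does under the hood, here is a minimal Python sketch, not the actual check_ntp source. It sends a basic SNTP client request, works out the clock offset from the four timestamps, and maps it to the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). The function names and the 0.5 s / 1.0 s thresholds are assumptions picked for illustration, not the plugin's defaults.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_DELTA = 2208988800


def ntp_to_unix(seconds: int, fraction: int) -> float:
    """Convert a 64-bit NTP timestamp (two 32-bit halves) to Unix seconds."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def clock_offset(reply: bytes, t1: float, t4: float) -> float:
    """Offset of the local clock relative to the server, in seconds.

    t1 is the local send time and t4 the local receive time. The server's
    receive (t2) and transmit (t3) timestamps sit at bytes 32-47 of the reply.
    """
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", reply[32:48])
    t2 = ntp_to_unix(rx_s, rx_f)
    t3 = ntp_to_unix(tx_s, tx_f)
    return ((t2 - t1) + (t3 - t4)) / 2


def nagios_status(offset: float, warn: float = 0.5, crit: float = 1.0) -> int:
    """Map an offset to a Nagios exit code (thresholds are illustrative)."""
    if abs(offset) >= crit:
        return 2  # CRITICAL
    if abs(offset) >= warn:
        return 1  # WARNING
    return 0  # OK


def query_offset(host: str, timeout: float = 2.0) -> float:
    """Send one SNTP client request to `host` and return the measured offset."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, version=3, mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(request, (host, 123))
        reply, _ = sock.recvfrom(48)
        t4 = time.time()
    return clock_offset(reply, t1, t4)
```

A wrapper script would call `query_offset("your.ntp.server")`, print a one-line status, and exit with `nagios_status(offset)`. In practice the maintained plugin is the better choice; this just shows there is no magic in it.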