Risk of starting NTP on database server?
I've heard rumors of bad things happening to database and mail servers if you change the system time while they are running. However, I'm having a hard time finding any concrete information on actual risks.
I have a production Postgres 9.3 server running on a Debian Wheezy host and the time is off by 367 seconds. Can I just run ntpdate or start openntp while Postgres is running, or is that likely to cause an issue? If so, what is a safer method of correcting the time?
Are there other services that are more sensitive to a change in system time? Maybe mail servers (exim, sendmail, etc) or message queues (activemq, rabbitmq, zeromq, etc)?
Solution 1:
Databases don't like backward steps in time, so you don't want to start with the default behavior of jumping the time. Adding the -x option to the ntpd command line will slew the time if the offset is less than 600 seconds (10 minutes). At the maximum slew rate it will take about a day and a half to adjust the clock by a minute. This is a slow but safe way to adjust the time.
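To put the slew rate into numbers, here is a small sketch of the arithmetic. It assumes ntpd's maximum slew rate of 500 ppm (0.5 ms of correction per elapsed second); the function name is made up for illustration.

```python
from datetime import timedelta

# Assumption: ntpd's maximum slew rate of 500 ppm
# (0.5 ms of correction for every second that passes).
MAX_SLEW_PPM = 500


def slew_duration(offset_seconds: float, slew_ppm: float = MAX_SLEW_PPM) -> timedelta:
    """Return how long it takes to slew away the given clock offset."""
    # Each elapsed second removes slew_ppm microseconds of offset.
    return timedelta(seconds=abs(offset_seconds) * 1_000_000 / slew_ppm)


if __name__ == "__main__":
    for offset in (60, 367):
        print(f"{offset:>4} s offset takes {slew_duration(offset)} to slew")
```

Under that assumption a 60 second offset takes a bit over 33 hours, and the 367 seconds from the question would take roughly eight and a half days, so slewing is only practical if you can live with the wrong time for that long.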
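Before running ntpd to adjust the time, you may want to start it with a small panic threshold, for example tinker panic 2 in ntp.conf, to verify how large an offset it is detecting. This sets the panic offset to 2 seconds, which should be relatively safe: ntpd exits with a log message instead of correcting a larger offset. Note that the -g command-line option takes no argument and does the opposite, allowing the first adjustment to exceed the panic threshold.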
An alternative I used before this option was available was to write a loop that set the clock back by a fraction of a second every minute or so. If you check that the reset won't change the current second, this is likely safe. If you use timestamps heavily, you may still end up with out-of-sequence records.
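A rough sketch of such a loop is shown below. It is not the script I actually used: the step size, interval, and function names are assumptions for illustration, and stepping the clock requires root on a Unix system.

```python
import math
import time

STEP = 0.25    # hypothetical: seconds removed per adjustment
INTERVAL = 60  # hypothetical: seconds to wait between adjustments


def stays_in_same_second(now: float, step: float) -> bool:
    """True if stepping back by `step` leaves the whole-second value unchanged."""
    return math.floor(now - step) == math.floor(now)


def steps_needed(offset: float, step: float) -> int:
    """Number of backward steps needed to remove `offset` seconds."""
    return math.ceil(offset / step)


def step_back_once(step: float = STEP) -> bool:
    """Step the clock back once; return False if skipped to protect the second."""
    now = time.time()
    if not stays_in_same_second(now, step):
        return False
    # Needs root (CAP_SYS_TIME); Unix only.
    time.clock_settime(time.CLOCK_REALTIME, now - step)
    return True


def run(offset: float, step: float = STEP, interval: float = INTERVAL) -> None:
    """Remove `offset` seconds in small steps (overshoots if not a multiple of step)."""
    remaining = steps_needed(offset, step)
    while remaining > 0:
        if step_back_once(step):
            remaining -= 1
            time.sleep(interval)
        else:
            time.sleep(0.05)  # retry a little later within the second
```

With these assumed values, removing 367 seconds takes 1468 steps, or about a day at one step per minute. Sub-second timestamps can still go backward, which is where the out-of-sequence records come from.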
A common option is to shut down the server long enough that there is no backward movement of the clock. ntpd or ntpdate can be configured to jump the clock to the correct time at startup. This should be done before the database is started.
Solution 2:
Databases can be especially vulnerable to system time changes if they are very active and have timestamps on internal records. In general, if your time is behind, you'll have far fewer problems when you suddenly jump forward than if you're ahead and suddenly jump backwards.
As Joffrey points out - it's much more often the application that has issues with sudden time jumps than the database itself. The safest way to correct the time is to shut down the application for N+1 minutes (where N is the number of minutes your system clock is ahead) and then sync time, start NTP, and restart the application. If you can't take that much downtime in the application, I can only suggest you take a backup of the database before syncing time, then offer up a dead squirrel to the gods of computerdom and just pull the trigger. Ok, I'm being a bit facetious, but I can't think of any other "safe" way than taking an application outage.
Solution 3:
It is usually not the database server that is vulnerable to errors when an instant time leap occurs: it's the applications that use the time that are.
There are generally two ways of tracking time: own time tracking or comparing system time. Both have some positive and negative tradeoffs.
Own time tracking
I see this used in some embedded programming and in systems where exact timing is not that critical. The main application loop takes care of tracking a 'tick'. This could be an alarm delivered by the kernel, or a sleep or select call that gives an indication of the amount of time that has passed. Once you know how much time has passed, you can add it to or subtract it from a counter. This counter is what drives the timing in your application. For example, if the counter is higher than 10 seconds you can discard something, or you need to do something.
While the application is not ticking (for example, while it is stopped), the counter does not change. This could be desired depending on the design of your application. For example, keeping track of how long a long-running process has been working on something is easier with a counter than with a list of start/stop timestamps.
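As a minimal sketch of what such a counter could look like (the class and method names are made up for illustration, and the tick length is whatever your main loop believes has passed):

```python
import time


class TickCounter:
    """Elapsed-time counter driven by the application's own loop ticks."""

    def __init__(self) -> None:
        self.elapsed = 0.0

    def tick(self, seconds: float) -> None:
        """Add the time the main loop believes has passed since the last tick."""
        self.elapsed += seconds

    def exceeded(self, limit: float) -> bool:
        """True once more than `limit` seconds of ticks have accumulated."""
        return self.elapsed > limit


def wait_until_exceeded(limit: float, tick_seconds: float = 1.0, sleep=time.sleep) -> int:
    """Tick until the counter passes `limit`; return the number of ticks taken."""
    counter = TickCounter()
    ticks = 0
    while not counter.exceeded(limit):
        sleep(tick_seconds)         # the 'tick': assume this much time passed
        counter.tick(tick_seconds)  # the system clock is never consulted
        ticks += 1
    return ticks
```

Because the counter never reads the system clock, stepping the clock has no effect on it. The price is that it is only as accurate as the tick source.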
Pro:
- Not dependent on system clock
- Will not break on a big time skew
- No costly system call
- Small counters will cost less memory than a full timestamp
Con:
- Time is not very accurate
- Change in system time could make it even more inaccurate
- Timing is relative to running the application, does not persist
Comparing system time
This is the system used more often: store a timestamp and later compare it with the current timestamp obtained from a system time call. Huge skews in the system time could threaten the integrity of your application: a task of a few seconds could take hours or end immediately, depending on the direction in which the clock moved.
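To illustrate, here is a small sketch of a timeout check based on stored timestamps. The function name and the numbers are made up; the 367 seconds are borrowed from the question.

```python
import time
from typing import Optional


def is_expired(started_at: float, timeout: float, now: Optional[float] = None) -> bool:
    """Compare a stored wall-clock timestamp with the current system time."""
    if now is None:
        now = time.time()  # system time call
    return now - started_at >= timeout


started = 1_000_000.0  # hypothetical stored timestamp
timeout = 5.0          # the task should expire after 5 seconds

# Clock untouched, 6 real seconds later: expired, as intended.
normal = is_expired(started, timeout, now=started + 6)

# Clock stepped back 367 s after 6 real seconds: not expired,
# and it will stay that way for several more minutes.
after_backward_step = is_expired(started, timeout, now=started + 6 - 367)

# Clock stepped forward 367 s after 1 real second: expired immediately.
after_forward_step = is_expired(started, timeout, now=started + 1 + 367)
```

The comparison itself is correct in all three cases. It is the input from the system clock that changes the outcome.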
Pro:
- Accurate time comparison
- Persists over restarts and long outages
Con:
- Takes a system call to get a fresh timestamp to compare with other timestamps
- Application needs to be aware of skews or can break
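One possible way for an application to become aware of such skews is to compare the progress of the system clock with a monotonic clock, which is not affected by steps of the system time. This is only a sketch of that idea and not something taken from a particular product; the class name and the threshold are assumptions.

```python
import time


class ClockStepDetector:
    """Detect steps of the system clock by comparing it with a monotonic clock."""

    def __init__(self, threshold: float = 1.0, wall=time.time, mono=time.monotonic):
        self._wall = wall
        self._mono = mono
        self._threshold = threshold  # hypothetical: ignore differences below this
        self._last_wall = wall()
        self._last_mono = mono()

    def check(self) -> float:
        """Return the step in seconds since the last check (0.0 if below threshold)."""
        wall_now = self._wall()
        mono_now = self._mono()
        # Both clocks should advance by the same amount between two checks;
        # any difference means the system clock was moved in the meantime.
        step = (wall_now - self._last_wall) - (mono_now - self._last_mono)
        self._last_wall = wall_now
        self._last_mono = mono_now
        return step if abs(step) >= self._threshold else 0.0
```

An application could call check() from its main loop and, on a non-zero result, shift or re-evaluate its stored deadlines instead of trusting them blindly. A negative value means the clock went backward.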
Affected systems
Most applications use timestamp comparison to schedule tasks. For database systems, that could be cache cleanups.
All applications that use a database and call time functions in the query language will be affected by skews if the application does not detect and handle them accordingly. Depending on its purpose, an application could never stop running or could allow indefinite login periods.
Mail systems use timestamps and/or timeouts for handling stale or undelivered mail. A clock skew could affect that, but with a much smaller impact. Back-off timers for reconnecting to servers could be missed, resulting in penalties for the connecting server.
I do not think (have not researched) that kernel alarms will go off when changing the system time. Systems that use these could be safe.
Solutions
Gently move (slew) the time instead of stepping it. How to do this is described in the documentation of your favorite time solution.