Active Directory Time Synchronisation - Time-Service Event ID 50
I have an Active Directory domain with two DCs. The first DC in the forest/domain is Server 2012, the second is 2008 R2. The first DC holds the PDC Emulator role.
I sporadically receive a warning from the Time-Service source, event ID 50:
The time service detected a time difference of greater than %1 milliseconds for %2 seconds. The time difference might be caused by synchronization with low-accuracy time sources or by suboptimal network conditions. The time service is no longer synchronized and cannot provide the time to other clients or update the system clock. When a valid time stamp is received from a time service provider, the time service will correct itself.
Time sync in the domain is configured with the second DC synchronising using the /syncfromflags:DOMHIER flag. The first DC is configured to sync time using /syncfromflags:MANUAL /reliable:YES, from a peer list consisting of a number of UK based stratum 2 servers, such as ntp2d.mcc.ac.uk.
I'm confused why I receive this event warning. It implies that my PDC emulator cannot synchronise time with a supposedly reliable external time source, and it quotes a time difference of >5 seconds for 900 seconds. It's worth also mentioning that I used to use a UK pool from ntp.org but I would receive the warning much more often. Since updating to a number of UK based academic time servers, it seems to be more reliable.
Can someone with more experience shed some light on this - perhaps it is purely transient? Should I disregard the warning? Is my configuration sound?
EDIT:
I should add that the DCs are virtual, and installed on two separate VMware ESXi/vSphere physical hosts.
I can also confirm that, as per MDMarra's comment and best practice, VMware timesync is disabled, since:
"c:\Program Files\VMware\VMware Tools\VMwareToolboxCmd.exe" timesync status
returns Disabled.
EDIT 2
A strange new issue has cropped up, and I've noticed a pattern. Originally, the event ID 50 warnings would occur at about 12:30 pm each day. This is interesting since our Veeam backup happens at 12 midday.
Since I made the changes discussed here, I now receive an event ID 51 instead of 50. The new warning says that:
The time sample received from peer server.ac.uk differs from the local time by -40 seconds
(Or approximately 40 seconds). This has happened two days in a row. Now I'm even more confused. Obviously the time never updates until I manually intervene.
The issue seems to be related to virtualisation and Veeam. Something may be occurring when Veeam is backing up the PDCe. Any suggestions?
UPDATE & SUMMARY
msemack's excellent list of resources below (the accepted answer) provided enough information to correctly configure the time service in the domain. This should be the first port of call for any future people looking to verify their configuration.
I have resolved the final "40 second jump" issue (there are no more warnings) by adjusting the VMware time sync settings as noted in the Veeam knowledge base article here: http://www.veeam.com/kb1202
In any case, for any future reader using ESXi, with or without Veeam, the resources here are an excellent source of information on the time sync topic, and msemack's answer is invaluable.
Here is my recommended configuration for Windows Domain Time Synchronization, pieced together from several Microsoft TechNet articles and blog posts.
If your servers are virtualized, do not use any of the VMware tools time sync features. Just let the Windows Time Service (w32time) do its job. VMware even says so. I assume the same is true for Hyper-V. Furthermore, if you have both the VM tools and the Windows Time Service attempting to manage the system clock, you can end up with a "tug-of-war" situation where your clock will keep jumping around and never be accurate.
Your Primary Domain Controller Emulator should be manually configured to sync with multiple external NTP servers (four is a good number). Using multiple NTP sources provides redundancy and serves as a sanity check in case one server starts sending bad time data (it has happened before). Active Directory assumes your PDCe is the central authoritative time source for your network. Everything else in your domain should sync from the PDCe (including the other domain controllers).
I recommend that your PDCe is a physical server (if possible). Every other server can be a VM. I feel more comfortable with the PDCe being a physical server, for two reasons:
3a. A physical server is less prone to time drift. VM time drift is a well-documented phenomenon. A virtualized server can see its clock drift by several minutes per day. Not a good choice for a time source! (Note that even on a physical server, the Real Time Clock will still drift by ~2 seconds a day without an external source. That's why you need NTP.)
3b. I know the date/time of a physical server will come up correctly after a cold power-on. I had a situation a few years ago where on a complete power down of the server room, the VMs came up with their time set to UTC instead of the local time zone. I think they pulled time from the ESX host (which was in UTC) and did not adjust for time zone properly. That caused all kinds of fun with services failing to start. Had to manually correct time and reboot everyone.
If your PDCe is currently a VM and you have a physical domain controller available, it is relatively easy to transfer the roles over.
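To put the drift figures from 3a in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes drift is linear (real clocks only approximate this), and the 3 minutes per day figure is an assumed stand-in for "several minutes per day", not a measurement.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_between_syncs(drift_per_day_s: float, sync_interval_s: float) -> float:
    """Offset accumulated between two syncs, assuming linear drift."""
    return drift_per_day_s * sync_interval_s / SECONDS_PER_DAY


# Assumed example: a VM drifting 3 minutes (180 s) per day.
vm_hourly = drift_between_syncs(180, 3600)    # synced once per hour
vm_quarter = drift_between_syncs(180, 900)    # synced every 15 minutes

# Physical server RTC drifting ~2 s per day (the figure quoted in 3a).
physical_hourly = drift_between_syncs(2, 3600)

print(f"VM, hourly sync:       {vm_hourly:.3f} s")
print(f"VM, 15-minute sync:    {vm_quarter:.3f} s")
print(f"Physical, hourly sync: {physical_hourly:.3f} s")
```

Under these assumptions the VM is off by several seconds between hourly syncs, while the physical box stays within a fraction of a second.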
Microsoft (and others) recommend you use Stratum 2 or Stratum 3 NTP servers as the time source for your PDCe.
While public Stratum 1 servers exist, there are a limited number of them and they get overloaded a lot. Using Stratum 1 servers as a time source when you don't really need it makes you a jerk. (Yes, there are people who really need Stratum 1. You're probably not one of them. If you really want to use a Stratum 1 source, buy a GPS clock for your local network.)
All of your external NTP sources should be in the same Stratum. Suppose you have one Stratum 2 source and a few Stratum 3 sources. The Windows Time Service will favor the Stratum 2 source. Your PDCe will become a Stratum 3 server. w32time will ignore the Stratum 3 servers (because they are no better than your PDCe). Windows will not let your server degrade to a higher/worse stratum without manual intervention (e.g. restarting the time service). So, if your Stratum 2 source goes offline, you will be stuck with no fallback.
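The stratum behaviour described above can be sketched as a toy model in Python. This only mirrors the description in this answer; it is not w32time's actual source selection algorithm, and the function names are made up for illustration.

```python
def effective_stratum(source_strata):
    """Stratum the PDCe ends up at: one worse than its best source."""
    return min(source_strata) + 1


def usable_sources(source_strata):
    """Sources that are strictly better than the PDCe's own stratum.

    Per the description above, sources at the same stratum as the PDCe
    (or worse) are ignored.
    """
    own = effective_stratum(source_strata)
    return [s for s in source_strata if s < own]


mixed = [2, 3, 3, 3]      # one Stratum 2 source, three Stratum 3 sources
uniform = [2, 2, 2, 2]    # four Stratum 2 sources

print(effective_stratum(mixed), usable_sources(mixed))
print(effective_stratum(uniform), usable_sources(uniform))
```

In the mixed case only one source is usable, so there is no fallback if it goes offline. In the uniform case all four remain candidates.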
Because the Windows Time Service is picky about the Stratum of your time source, I do not recommend using pool.ntp.org (at least not for a PDCe). There is no guarantee about the stratum of the server you get served from the pool.
Instead, I recommend you pick four Stratum 2 servers from the ntp.org list. Try to pick ones that are physically close to you (network latency hurts NTP). Verify the servers are still valid and alive (this list does change over time). Note the Microsoft default time.windows.com is notorious for problems. I would not trust my domain to it.
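One way to verify that a candidate server is alive and reports the stratum you expect is to send it a single SNTP client request and read the stratum byte from the reply. The following is a minimal sketch using only the Python standard library. The function names are my own, it does no validation beyond a length check, and it needs outbound UDP port 123 to be open.

```python
import socket

NTP_PORT = 123


def build_request() -> bytes:
    """48-byte SNTP request: LI=0, version 3, mode 3 (client)."""
    return b"\x1b" + 47 * b"\x00"


def parse_header(packet: bytes) -> dict:
    """Extract leap indicator, version, mode and stratum from an NTP packet."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    first = packet[0]
    return {
        "leap": first >> 6,
        "version": (first >> 3) & 0x7,
        "mode": first & 0x7,
        "stratum": packet[1],
    }


def query_stratum(host: str, timeout: float = 2.0) -> int:
    """Ask one NTP server for its stratum. Raises on timeout or DNS failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, NTP_PORT))
        data, _ = sock.recvfrom(512)
    return parse_header(data)["stratum"]
```

Calling query_stratum("ntp2d.mcc.ac.uk") (the server named in the question) should return 2 if that server is still a reachable Stratum 2 source. A stratum of 0 or 16 in a reply generally means the server is not synchronised.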
If you have been playing around with the Windows Time Service before now, or you inherited this network from someone else, it is probably a good idea to reset w32time to the default settings before you start re-configuring it. Run the following commands on your domain controllers, starting with the PDCe.
net stop w32time
w32tm /unregister   <-- If you get an Access Denied message, reboot.
w32tm /register
net start w32time
I recommend you reboot the server 1-2 times after running these commands and make sure the Windows Time Service is present, set to Automatic, and started. I have seen situations where the /unregister command did not take effect until the following reboot. Then you have a surprise when you reboot after doing Windows patches and the w32time service is suddenly missing!
To configure the Windows Time Service on your PDCe, I recommend you create a PDCe-specific GPO that uses a WMI filter for DomainRole = 5, and put all of your NTP client settings in here. Otherwise, you can use the w32tm command, or set the registry manually. See here for examples of all three methods.
Configure the PDCe to use NTP instead of NT5DS (Type = NTP in the Windows Time Service config). Otherwise, the PDCe will try to sync with itself, which won't work very well.
When you enter the list of NTP servers in the Windows Time Service config (in the GPO, the Registry, or w32tm), make sure you enter the server list in this format:
server1.whatever.com,0x9 server2.otherplace.com,0x9 server3.another.com,0x9
The 0x9 flag at the end of each server tells w32time to use the polling interval specified in SpecialPollInterval (0x1) and that the time sync is client-only, not a two-way sync (0x8).
When configuring the PDCe NTP client, check the value of SpecialPollInterval. If your PDCe is a physical box, set it to 3600 seconds (once per hour). If your PDCe is a VM, pick something more aggressive, like every 15 min, to combat VM time drift.
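To make the server list format and the 0x9 value concrete, here is a small Python sketch that builds the space-separated peer list and decodes a flag value into its bits. The bit names follow Microsoft's documentation for the NtpServer setting. The helper names are my own, and the server names are the placeholders from the example above.

```python
# Per-peer flags accepted in the w32time NtpServer list.
NTP_PEER_FLAGS = {
    0x1: "SpecialInterval",    # poll at SpecialPollInterval
    0x2: "UseAsFallbackOnly",
    0x4: "SymmetricActive",
    0x8: "Client",             # client-only, not a two-way sync
}


def decode_flags(value: int) -> list:
    """Names of the flag bits set in value."""
    return [name for bit, name in NTP_PEER_FLAGS.items() if value & bit]


def build_peer_list(servers, flags: int = 0x9) -> str:
    """Space-separated 'host,0xN' entries, as expected by w32time."""
    return " ".join(f"{server},{flags:#x}" for server in servers)


print(decode_flags(0x9))
print(build_peer_list(["server1.whatever.com", "server2.otherplace.com"]))
```

The resulting string is what goes into the GPO's NtpServer field, the NtpServer registry value, or the /manualpeerlist argument of w32tm /config.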
In general, you should not need to mess with AnnounceFlags. The default of 10 is good for all domain controllers (PDCe or otherwise). It will automatically advertise as a time source if appropriate.
I recommend that all Domain Controllers (PDCe and otherwise) have the NTP Server enabled. I would create a GPO for Domain Controllers and enable it there. If you don't want to use Group Policy, you can do this in the Registry at HKLM\SYSTEM\CCS\Services\W32Time\TimeProviders\NtpServer\Enabled=0x1.
Make sure you have MaxPosPhaseCorrection and MaxNegPhaseCorrection set to a sane value on all your domain controllers! This will protect your domain in case one of your external NTP sources goes off in the weeds and broadcasts a wildly inaccurate timestamp (it has happened). If you are on Win2008 or later, these limits should be set to 48 hours by default, but Win2003 has these set to unlimited. You can set these in the previously-mentioned Domain Controllers GPO, or do it in the Registry directly (HKLM\SYSTEM\CCS\Services\W32Time\Config).
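As a rough illustration of what these two limits do, the sketch below models them in Python. It is a simplification of the idea, not w32time's real logic. The values are in seconds: 48 hours is 172800 (0x2A300), and 0xFFFFFFFF is the "unlimited" value. The function name and the sign convention are my own assumptions.

```python
UNLIMITED = 0xFFFFFFFF              # "always correct", the Win2003 behaviour
FORTY_EIGHT_HOURS = 48 * 60 * 60    # 172800 seconds


def correction_allowed(offset_s: float,
                       max_pos: int = FORTY_EIGHT_HOURS,
                       max_neg: int = FORTY_EIGHT_HOURS) -> bool:
    """Would a time sample with this offset be applied?

    offset_s > 0: the source is ahead of the local clock (forward correction).
    offset_s < 0: the source is behind the local clock (backward correction).
    """
    limit = max_pos if offset_s >= 0 else max_neg
    return limit == UNLIMITED or abs(offset_s) <= limit


one_year = 365 * 24 * 60 * 60

print(hex(FORTY_EIGHT_HOURS))
print(correction_allowed(-40))                           # small offset
print(correction_allowed(-one_year))                     # bad source, limited
print(correction_allowed(-one_year, max_neg=UNLIMITED))  # bad source, unlimited
```

With the 48-hour limits, a source that is a year off is rejected. With unlimited, the same sample would roll the clock back.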
For the domain controllers, I also recommend that you set EventLogFlags = 0x3. That will give you some additional visibility into the sync progress over time. Note that there are two EventLogFlags values to set. One is under HKLM\SYSTEM\CCS\Services\W32Time\Config (for all domain controllers). The other one is under HKLM\SYSTEM\CCS\Services\W32Time\TimeProviders\NtpClient (only relevant for PDCe). Both can be managed from Group Policy. I set them both to 0x3. (Note that I have found some discrepancies on the description of this setting between TechNet and the Group Policy description.)
Except for the one PDCe, every other Windows machine on the domain should be set to use NT5DS Domain Hierarchy for time sync. That includes all your other domain controllers, any other servers, and your workstations. NT5DS is the default for domain joined computers so you should not need to mess with it.
Note the only time-related settings I have in my domain are (1) the PDCe NTP Client GPO with the WMI filter and (2) the Domain Controllers GPO that enables NTP server, sets the Max Phase Correction Values, and EventLogFlags. All of the Group Policy time settings can be found under Computer Configuration\Administrative Templates\System\Windows Time Service. I do not have any explicit configuration in the Registry or with the w32tm command. I recommend using Group Policy for this stuff so it transcends the actual server. If you add a new domain controller in the future, or replace your PDCe, everything will "just work". Otherwise, you have to remember to manually configure the new server.
Some additional notes on the above configuration:
While it is possible to bypass the domain hierarchy and explicitly configure your clients to sync to a specific server, I have had bad luck with this. I recommend you just leave everything except the PDCe at NT5DS and let the Time Service work as Microsoft intended.
Keep in mind the Windows Time Service is intended for small, periodic corrections to your system clock. The assumption is your server's clock was set properly to begin with, and w32time will keep it that way. If your server gets too far out of sync with your external NTP sources, it will pretty much "give up". If you have followed the recommendations above, you should stay closely in sync with your external time sources. However, if you have a VM environment with really bad time drift (an overloaded VM host, constant snapshots), you may still get out of sync. If so, there are several settings for "spike detection" that you can tweak. It is probably a band-aid on another problem in your environment though. Make absolutely sure you have implemented all of the recommendations above before you dig deeper into the settings!
Applying the configuration changes and checking everything:
If you used Group Policy to configure the time service, the change should propagate to all your Domain Controllers shortly. You can run the gpupdate /force command on each domain controller (starting with the PDCe) to make it happen immediately.
If you decide not to use Group Policy and manually configure the time service with w32tm or by editing the registry, make sure you run w32tm /config /update on each affected server and then restart the service (starting with the PDCe). Otherwise your settings will not take effect!
Next up, run w32tm /resync /rediscover on the PDCe. Wait a few minutes, then look at the Event Viewer for problems. There may be some error/warning messages from unregistering/registering the time service, but after that everything should be golden. You should see messages about getting valid time data from your NTP servers. Once you are sure the PDCe is good, go on to the other domain controllers and run the same commands.
Once the time service is syncing on all the DCs, you can do a w32tm /monitor. Make sure the domain controllers are listed and their RefID and Stratum look correct. If you are using Stratum 2 servers, your PDCe should be Stratum 3. You can also run w32tm /query /status /verbose (Win2008 or later only) and watch the last updated time. Make sure it updates as expected.
Once your domain controllers are in order, run w32tm /resync /rediscover on some workstations and member servers. Check the Event Viewer for errors. If you have messed with the time service on other workstations, you may have to run the w32tm unregister/register commands on them as well.
Follow Up:
For completeness, you should make sure all of your non-Windows NTP clients (routers, switches, print servers, etc.) are pointed at your domain controllers as a time source. I recommend setting up a CNAME DNS entry for ntp.yourdomain.com that points to yourdomain.com. That way you don't have to explicitly list domain controller names or IP addresses on all your devices, which will help when you add/retire servers in the future. Your non-Windows NTP clients will use whichever domain controller comes up in the round-robin DNS. (Note this only works if you enabled the NTP server on all your domain controllers.)
Also, on your DHCP server(s), make sure scope option 42 is configured to point to your domain controllers. Any DHCP-configured device that supports option 42 will sync time with the domain controllers automatically.
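For reference, option 42 is simply a list of IPv4 addresses (RFC 2132, "Network Time Protocol Servers"), so it has to carry your domain controllers' IP addresses rather than a DNS name. The sketch below shows how the option is laid out on the wire. The addresses are documentation placeholders, and your DHCP server's management console builds this for you, so this is only to show what the clients receive.

```python
import ipaddress

DHCP_OPTION_NTP_SERVERS = 42


def encode_option_42(servers) -> bytes:
    """Option code, payload length, then 4 bytes per NTP server address."""
    if not servers:
        raise ValueError("option 42 needs at least one address")
    payload = b"".join(ipaddress.IPv4Address(s).packed for s in servers)
    if len(payload) > 255:
        raise ValueError("too many addresses for a single option")
    return bytes([DHCP_OPTION_NTP_SERVERS, len(payload)]) + payload


# Placeholder addresses standing in for two domain controllers.
option = encode_option_42(["192.0.2.10", "192.0.2.11"])
print(option.hex(" "))
```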
Sources For My Info:
- Good overview of setting up a PDCe
- Windows Time Service Technical Reference
- Another example of a PDCe GPO
- How to set a list of NTP Servers and use SpecialPollInterval Part 1
- How to set a list of NTP Servers and use SpecialPollInterval Part 2
- W32Time Registry Settings
- W32Time Group Policy Settings
- MaxPosPhaseCorrection and MaxNegPhaseCorrection
- Debugging W32Time
- Info on AnnounceFlags
- A blog with more than you ever wanted to know about w32time
- Oh crap, a bad NTP server caused a time rollback! Part 1
- Oh crap, a bad NTP server caused a time rollback! Part 2
- VMware Best Practices for Timekeeping in Windows
- VMware info on Virtualizing a Domain Controller
You may want to see what shows up in the w32tm debug log when the event occurs.
To enable debug logging:
w32tm /debug /enable /file:C:\Windows\Debug\w32tm.log /size:50000000 /entries:0-300
More information:
http://blogs.msdn.com/b/w32time/archive/2008/02/28/configuring-the-time-service-enabling-the-debug-log.aspx
As for whether your configuration is sound: if your DC is not advertising as a time server for extended periods of time, that would be a problem. You can spot-check whether your DC is advertising using the command:
nltest /dsgetdc:domain.com
It should have the TIMESERV flag.
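If you want to automate this spot-check, something like the following Python sketch could scan saved nltest output for the flag. The sample text is a made-up illustration of the usual output layout (host name, address and site are invented), and the function name is my own.

```python
def has_timeserv(nltest_output: str) -> bool:
    """True if the 'Flags:' line of nltest /dsgetdc output lists TIMESERV."""
    for line in nltest_output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() == "Flags":
            # Split into whole words so GTIMESERV alone does not count.
            return "TIMESERV" in value.split()
    return False


# Invented sample in the usual nltest layout.
SAMPLE = r"""
           DC: \\dc1.domain.com
      Address: \\192.0.2.10
     Dom Name: domain.com
  Forest Name: domain.com
 Dc Site Name: Default-First-Site-Name
        Flags: PDC GC DS LDAP KDC TIMESERV WRITABLE DNS_DC DNS_DOMAIN DNS_FOREST CLOSE_SITE
The command completed successfully
"""

print(has_timeserv(SAMPLE))
```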
You can view detailed information about the time service using the command:
w32tm /query /status /verbose
Sample output:
Leap Indicator: 0(no warning)
Stratum: 2 (secondary reference - syncd by (S)NTP)
Precision: -6 (15.625ms per tick)
Root Delay: 0.0312500s
Root Dispersion: 0.0314141s
ReferenceId: 0x81060F1E (source IP: 129.6.15.30)
Last Successful Sync Time: 3/25/2014 11:55:30 AM
Source: time-c.nist.gov
Poll Interval: 7 (128s)
Phase Offset: 0.0000667s
ClockRate: 0.0156001s
State Machine: 2 (Sync)
Time Source Flags: 0 (None)
Server Role: 64 (Time Service)
Last Sync Error: 0 (The command completed successfully.)
Time since Last Good Sync Time: 97.2535519s
State Machine should be 'Sync', and Server Role should be 'Time Service'. If Source is not an external NTP server in the peer list, that would be an issue. Known bad source entries to watch out for would be 'Free Running System Clock' or 'Local CMOS Clock'.
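The checks above are easy to script. Here is a minimal Python sketch that parses the text of w32tm /query /status /verbose and reports anything that looks wrong. It assumes English-language output in the layout shown in the sample, and the function names are my own.

```python
BAD_SOURCES = ("free running system clock", "local cmos clock")


def parse_status(text: str) -> dict:
    """Turn 'Key: value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_status(fields: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if "(Sync)" not in fields.get("State Machine", ""):
        problems.append("State Machine is not Sync")
    if "Time Service" not in fields.get("Server Role", ""):
        problems.append("Server Role is not Time Service")
    # Normalise hyphens so 'Free-running' and 'Free Running' both match.
    source = fields.get("Source", "").lower().replace("-", " ")
    if not source or any(bad in source for bad in BAD_SOURCES):
        problems.append("Source is not an external NTP server")
    return problems


# A few lines taken from the sample output above.
SAMPLE = """\
Stratum: 2 (secondary reference - syncd by (S)NTP)
Last Successful Sync Time: 3/25/2014 11:55:30 AM
Source: time-c.nist.gov
State Machine: 2 (Sync)
Server Role: 64 (Time Service)
"""

print(check_status(parse_status(SAMPLE)))
```

This does not confirm that Source is actually in your peer list. It only catches the known bad values mentioned above.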