How to synchronize time on ESXi Windows virtual machines within one second?

I'm a developer, and we use Quartz.Net, a widely used scheduling library, with a SQL backing store to run a cluster of job servers (VMs on an ESXi cluster).

Quartz.Net requires that time be synchronized between job server instances and recommends using NTP for it.

The clocks must be within a second of each other.

Our sysadmins use Windows NTP to sync time with the domain controller. Synchronization of the VMs with the ESXi host is off.

They keep insisting that "within a second" is not a valid requirement and that it cannot be met without hardware GPS-syncing devices. Their SLA and monitoring level is "within 3 minutes".

Periodically (about once every 2-3 months) our Quartz instances show out-of-sync behavior that is consistent with the clocks being out of sync.

  1. Is it correct for us to ask for "within a second", or do we need to ditch Quartz entirely?
  2. If yes, what changes are recommended for our setup?

This is 2018. Windows is capable of keeping servers synchronized within 2 ms or so, as required by the MiFID II regulations. So your problem is a non-problem.

Our sysadmins use Windows NTP to sync time with the domain controller. Synchronization of the VMs with the ESXi host is off.

Why? The host can handle this a lot better (being hardware), and you have a lot fewer hosts than VMs. Your sysadmins shoot themselves in the foot, then complain they are bleeding.

They keep insisting that "within a second" is not a valid requirement and that it cannot be met without hardware GPS-syncing devices. Their SLA and monitoring level is "within 3 minutes".

OLD - ancient - Windows only synchronized to within that kind of timeframe because Kerberos tolerates up to 5 minutes of clock skew by default.

But this is, as I said, 2018. The financial industry has quite brutal requirements these days, and MS has handled that since 2012, I think; 2016 put it fully into effect. Millisecond accuracy over the internet is a solved problem - solved decades ago, actually, given a decent connection. NTP can handle it. You may have to put up a cheap hardware box if you want to cut down on traffic (i.e. run your own tier 3 NTP time source), but even that is not expensive.

Is it correct for us to ask for "within a second", or do we need to ditch Quartz entirely?

You need to program for occasional time issues - as you would with hardware. But "within a second" is a joke of a requirement - it is trivial to meet under normal circumstances.

Some references:

https://docs.microsoft.com/en-us/windows-server/networking/windows-time-service/accurate-time

Government regulations such as 50 ms accuracy for FINRA in the US and 1 ms for ESMA (MiFID II) in the EU.

Lots of detail and instructions there. This is an amazing read, actually, if you have to solve this problem. You may have to upgrade your hypervisor - they talk all about Hyper-V. VMware should be able to do the same, but I'm not sure how old your version is.


Is it correct for us to ask for "within a second", or do we need to ditch Quartz entirely?

There are lots of very good reasons for various application stacks to need tight time control, and what Quartz is asking for is far from unusual.

If yes, what changes are recommended for our setup?

The best bet is to make every single part of your system use NTP and point them all to the same pair of NTP servers: the ESXi hosts, the VMs running on them, and anything else involved. This way, even if the NTP servers are 'off time', at least every part of your system is in sync with every other part.
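A small sketch of why a shared source helps: what matters for Quartz is the difference between the job servers' clocks, not their distance from true time, so an error in the common reference cancels out. The node names and offset values below are invented for illustration.

```python
def max_pairwise_skew(offsets):
    """Largest clock difference between any two nodes, in seconds.

    `offsets` maps a node name to its measured offset (seconds) from the
    shared reference clock. With fewer than two nodes there is no skew.
    """
    values = list(offsets.values())
    if len(values) < 2:
        return 0.0
    return max(values) - min(values)


def within_tolerance(offsets, tolerance=1.0):
    """True if every pair of nodes is within `tolerance` seconds."""
    return max_pairwise_skew(offsets) <= tolerance


# Hypothetical measurements against the shared NTP source.
measured = {"jobs01": 0.120, "jobs02": -0.080, "jobs03": 0.310}

# Same nodes, but pretend the shared source itself is 45 s off true time:
# every node inherits the same error, so the pairwise skew is unchanged.
shifted = {name: value + 45.0 for name, value in measured.items()}
```

Here `within_tolerance(measured)` and `within_tolerance(shifted)` both hold, while two nodes following different sources that disagree by a couple of seconds would fail the check even if each tracks its own source perfectly.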


https://docs.microsoft.com/en-us/windows-server/networking/windows-time-service/support-boundary

High Accuracy support for Windows 8.1 and 2012 R2 (or Prior)

Earlier versions of Windows (Prior to Windows 10 1607 or Windows Server 2016 1607) cannot guarantee highly accurate time. The Windows Time service on these systems:

  • Provided the necessary time accuracy to satisfy Kerberos version 5 authentication requirements

  • Provided loosely accurate time for Windows clients and servers joined to a common Active Directory forest

Tighter accuracy requirements were outside of the design specification of the Windows Time Service on these operating systems and are not supported.

Windows 10 and Windows Server 2016

Time accuracy in Windows 10 and Windows Server 2016 has been substantially improved, while maintaining full backwards NTP compatibility with older Windows versions. Under the right operating conditions, systems running Windows 10 or Windows Server 2016 and newer releases can deliver 1 second, 50ms (milliseconds), or 1ms accuracy.

Target Accuracy: 1 Second (1s)

To achieve 1s accuracy for a specific target machine when compared to a highly accurate time source:

  • The target system must run Windows 10 or Windows Server 2016.

  • The target system must synchronize time from an NTP hierarchy of time servers, culminating in a highly accurate, Windows compatible NTP time source.

  • All Windows operating systems in the NTP hierarchy mentioned above must be configured as documented in the Configuring Systems for High Accuracy documentation.

  • The cumulative one-way network latency between the target and source must not exceed 100ms. The cumulative network delay is measured by adding the individual one-way delays between pairs of NTP client-server nodes in the hierarchy starting with the target and ending at the source. For more information, please review the high accuracy time sync document.
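The latency condition in the last bullet is simple addition over the hops of the NTP hierarchy. A sketch of that arithmetic, with hop names and delays that are purely illustrative (measure your own):

```python
def cumulative_one_way_latency_ms(hop_delays_ms):
    """Sum the one-way delays (ms) of each client-server pair in the chain."""
    return sum(hop_delays_ms.values())


def meets_latency_budget(hop_delays_ms, budget_ms=100):
    """True if the cumulative one-way latency does not exceed the budget."""
    return cumulative_one_way_latency_ms(hop_delays_ms) <= budget_ms


# Hypothetical hierarchy, from the target VM up to the accurate source.
hops = {
    "job VM -> domain controller": 2,
    "domain controller -> PDC emulator": 15,
    "PDC emulator -> external NTP source": 40,
}
```

With these made-up numbers the total is 57 ms, inside the 100 ms limit quoted above for the 1 s target.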

https://docs.microsoft.com/en-us/windows-server/networking/windows-time-service/configuring-systems-for-high-accuracy