Is there research material on NTP accuracy available?
No one can guarantee how well NTP will work on your network, because no one knows how well connected your network is to the internet, and to the clock servers thereon. However, according to the clock discipline algorithm page on ntp.org:

> If left running continuously, an NTP client on a fast LAN in a home or office environment can maintain synchronization nominally within one millisecond. When the ambient temperature variations are less than a degree Celsius, the clock oscillator frequency is disciplined to within one part per million (PPM), even when the clock oscillator native frequency offset is 100 PPM or more.
Note that large but stable latency between your LAN and the internet's clock servers has a much smaller effect on accuracy than highly variable latency does.
You don't say where you got the estimates above ('50 microseconds to ... "below one second"'), so I can't comment on them, but in my experience 50 µs is unlikely unless you have a directly-attached clock source, and 1 s is unlikely unless you have a piece of wet string connecting you to the internet and you're using upstream servers in Antarctica.
Edit: the text you now quote in your question points to a paper which, in 1999, did indeed establish that 99% of NTP servers were synced to within one second. Fortunately, there is more recent work; in this paper some authors from the Federal University of Paraná, Brazil, repeated the experiment in 2005 and found (if I understand their Fig. 1 correctly) that north of 99% - more like 99.5% - of servers now have offsets of less than 100 ms, and that 90% have offsets of less than 10 ms. This fits in pretty well with my experience (see above).
Edit 2: one last wrinkle: these studies don't investigate how accurate the local clock is, but rather how far it differs from the upstream reference clock. These are patently not the same thing. But the first is unknowable: to know how wrong your clock is, you have to know exactly what time it is, and if you knew that, why would you have set your clock wrong in the first place? Just be aware that what these studies measure is not the difference between the local clock and absolute time, but between the local clock and the reference clock.
What problem are you trying to solve?
The solution I've encountered for environments requiring more precision than NTP offers is the Precision Time Protocol (PTP). I've used it in scientific computing and financial computing applications. There are tradeoffs, though.
Also see: PTP time synchronization on CentOS 6 / RHEL
A few other things worth mentioning:
- You'll be lucky to get < 100 ms of clock jitter on a virtual machine, so everything below applies to a physical host
- Sub-100 ms jitter has no noticeable impact on almost every task, and is easily achievable over the Internet
- Sub-30 ms jitter may be needed for some general serving environments (I needed it for log correlation at a previous job), and is easily achieved using NTP servers on the same continent, provided the connection is not via "consumer" links (e.g. not satellite, ADSL, DOCSIS, GPON, UMTS/LTE/HSPA, etc.)
- For absolute accuracy below this you should install hardware NTP servers from a quality vendor (e.g. Symmetricom)
- Sub-10 ms (often sub-1 ms) local agreement can easily be achieved simply by running a trio of NTP servers (you can get by with fewer, but there are reasons to use three or five) within the same datacenter; that is enough for pretty much every non-scientific application (see the sketch after this list)
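As a rough illustration of that last point, a client peered with three in-datacenter servers could be configured roughly like this (chrony syntax; the hostnames are made-up assumptions, substitute your own):

```
# chrony.conf sketch -- hostnames are illustrative only
server ntp1.dc.example.com iburst
server ntp2.dc.example.com iburst
server ntp3.dc.example.com iburst
```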
Vested interest on my part: I'm a Meinberg agent :-)
Yes, NTP can achieve an end-to-end precision down to approx. 50 us (that's microseconds) of jitter, if you sync a Linux "client" on bare metal running Chrony or ntpd to a Linux-based NTP server disciplined by a GPS, a local atomic clock or some such source.
On the machine that has a local GPS (with a PPS interconnect), you will probably see 0-2 microseconds of offset between the ntpd instance running in the OS and its PPS refclock driver's input.
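For reference, a minimal refclock setup of that kind might look roughly like this in chrony terms (the gpsd SHM segment and the /dev/pps0 device are assumptions about a typical GPS+PPS rig; ntpd has equivalent drivers):

```
# chrony.conf sketch for a GPS+PPS-disciplined server -- device names are assumptions
# coarse time-of-day from gpsd via shared memory (only used to number the seconds)
refclock SHM 0 refid NMEA offset 0.2 precision 1e-3 noselect
# the precise pulse-per-second edge, paired with the NMEA source above
refclock PPS /dev/pps0 lock NMEA refid PPS precision 1e-7
```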
Those residual 50 us "end to end over a LAN" are a result of several stages of buffering, variable IRQ latency, other traffic interfering on the LAN and on the computer buses involved, and whatnot. 50 us means a LAN with very little traffic. Even just a switch can add a few microseconds of jitter - and higher-end switches with complex features add more latency and jitter. In other words, it can be pretty difficult to achieve those 50 microseconds under real-world conditions on a practical LAN.
Similarly, those circa <2 us of PPS offset result from just the IRQ latency uncertainty and general bus latency jitter on well-behaved PC hardware.
Note that NTP and its implementations ntpd and Chrony certainly measure the NTP transaction round-trip time and compensate for half of that round trip, as a way to filter out the systematic (one-way) transport latency. They also perform outlier rejection, quorum consensus and syspeer election, and any NTP daemon filters the responses it gets to its upstream queries. So, as others have said, the milliseconds that you see in ping and traceroute do not directly offset your local clock. What matters is the variability of the transaction round trip, i.e. other traffic on the path to your upstream NTP server. ntpq -p is your friend.
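The on-wire arithmetic behind that "half the round trip" compensation is simple; with t1..t4 being the four timestamps of one request/response exchange (as in RFC 5905), the client computes:

```
# t1 = client send, t2 = server receive, t3 = server send, t4 = client receive
offset = ((t2 - t1) + (t3 - t4)) / 2    # estimated local clock error vs. the server
delay  =  (t4 - t1) - (t3 - t2)         # round-trip time minus server processing time
```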
A basic GPS receiver for timing use, with a TCXO, can have maybe 100-200 ns of residual jitter+wander on its PPS output. That is plenty good enough for NTP, as long as the GPS stays locked. (Holdover performance is not very good with TCXOs.) A quality timing GPS with an OCXO can be well within 100 ns, maybe more like 10-30 ns of residual error (offset from global UTC).
Note that actual satellites flying overhead and beaming at you through an atmosphere may be a slightly tougher game for the receiver than benchmarking in a lab with a GPS signal generator.
PTP is a hammer. You need HW support in the grandmaster, in the slaves and in any switches - but if you get all that, residual offsets down to low double digits of nanoseconds are possible. I have personally seen this with ptp4l running on an i210 NIC, which has HW support (timestamping with nanosecond resolution).
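With linuxptp, a hardware-timestamped setup of that kind boils down to something along these lines (the interface name is an assumption; -H selects hardware timestamping, and phc2sys then steers the system clock from the NIC's PHC):

```
# PTP slave using the NIC's hardware timestamps, printing stats to stdout
ptp4l -i eth0 -H -s -m
# steer the system clock from the NIC's PTP hardware clock once ptp4l has locked
phc2sys -s eth0 -w -m
```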
The i210 chip is a wonder. It has 4 general-purpose pins that can be used to input or output a PPS signal. The reference Intel add-on NIC board with the i210 (and its OEM versions from several big vendors) comes equipped with a pin header that gives you access to at least 2 of those GPIO pins (Intel calls them SDPs). Apart from implementing a PTP grandmaster port, the PPS input can be leveraged for precise timestamping in packet capture. You need a precise source of PPS and a custom piece of software to run a servo loop, fine-tuning the i210's PHC against the external PPS. On my test rig, this resulted in single-digit ns (per 1 s iteration) of residual offset. This is the precision that you then get in your capture timestamps, if you run a recent tcpdump or wireshark on a modern Linux kernel (all the software needs support for nanosecond-level resolution).

Better yet: I went all the way and built a simple PLL synth to produce 25 MHz for the NIC clocks, locked to a precise upstream 10 MHz reference. After that, the residual offset in the servo loop of my packet capture rig dropped to a clean 0 (proof that my 10 MHz reference is phase-synchronous with the PPS from that same GPS box).
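If you want those hardware timestamps to show up in a capture, a recent tcpdump can request them together with nanosecond resolution (the interface name is an assumption):

```
# capture using the NIC's raw hardware clock, with nanosecond timestamp resolution
tcpdump -i eth0 -j adapter_unsynced --time-stamp-precision=nano -w capture.pcap
```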
Note that PTP grandmasters may be specified to provide timestamps with an actual granularity of 8 ns (in a data type with 1 ns resolution). This makes sense - gigabit Ethernet tends to use a 125 MHz clock, which serves as the byte clock in the internals of the MAC, is probably also used in the GMII, and is also the symbol clock in metallic 1000BASE-T (four pairs in parallel, 2 bits per symbol per pair). So unless you're using 1000BASE-X (fiber optic) with SERDES and an extremist implementation of the HW timestamping unit in the PHY that works down to individual SERDES bits, those 8 ns are all you can ever realistically hope for on gigabit Ethernet. Some chip datasheets (with PTP support) even claim that the MII data path is not free of buffering and some jitter can come from there.
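The granularity figure follows directly from that clock rate:

```
1 / 125 MHz = 8 ns per clock cycle
# i.e. the finest step a timestamping unit clocked at 125 MHz can resolve
```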
The PTP packets actually contain timestamps stored in a data type that allows for deep sub-nanosecond resolution, but the sub-nanosecond fractional field is nowadays typically unused. AFAIR only the White Rabbit project (related to CERN, the Swiss research centre) has implemented sub-ns precision so far.
PTP is also available in pure software, without HW acceleration. In that case, for a SW-based GM and a SW-based client, expect a residual jitter similar to NTP's - i.e. about 50 us on a dedicated but PTP-unaware LAN. I recall getting sub-microsecond precision from a HW grandmaster on a direct interconnect (no switch in between) to a SW-only client (on a PTP-unaware PC NIC). Compared to NTP, PTP's servo converges much faster.
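For the software-only case, ptp4l just needs to be told to use software timestamps (again, the interface name is an assumption):

```
# software-timestamped, slave-only PTP client -- no NIC timestamping support required
ptp4l -i eth0 -S -s -m
```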
While doing some "homework", it recently occurred to me that transporting PPS or similar "discrete" timing signals over wide-area fiber-optic routes may be susceptible to temperature-dependent propagation-time "wander". And although I have no way to test this experimentally, some sources on the interwebs quote figures between 40 and 76 picoseconds per km and Kelvin. Note that while this kind of "thermal wander" is impossible to mitigate "in band" in a simplex PPS transmission, PTP would post-compensate for it inherently, based on its standard path-delay measurements (which depend on full-duplex transmission).
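To put those figures into perspective, a back-of-the-envelope example (the route length and the temperature swing are made-up assumptions):

```
# ~50 ps/(km*K), 1000 km of fiber, 5 K seasonal temperature swing:
50 ps/(km*K) * 1000 km * 5 K = 250 000 ps = 250 ns of propagation-time wander
```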
So much for an overview of what the "precisions" look like with different timing technologies and interfaces. What level of precision is good enough for you depends on your application and your actual needs.
---- Update in 2020: ----
A colleague has recently demonstrated to me that Chrony can be made to run NTP on well-behaved PC hardware (with no special treatment of its bus clock oscillators) with a residual end-to-end jitter of a microsecond or less. This can be observed under the following conditions:
- systems idle
- network idle
- poll interval configured for 1 s (minpoll=maxpoll=0, I believe)
- Chrony can also make use of HW timestamping if available (see the config sketch after this list)
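A minimal chrony.conf fragment along those lines might look like this (the server name is a made-up assumption; xleave enables interleaved mode, which pairs well with hardware timestamps):

```
# chrony.conf sketch -- server name is illustrative only
server ntp1.lan.example minpoll 0 maxpoll 0 xleave
# use NIC hardware timestamping on every interface that supports it
hwtimestamp *
```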
There's a Gentoo wiki page claiming that the hardware timestamping is dependent on co-existence with ptp4l, which I doubt (it's not suggested by Chrony's own documentation) - though it does make sense to me that the PHC in a NIC should somehow get disciplined for HW timestamping to make good sense... I'm not sure if Chrony can benefit from a PHC with a free-running clock. Apart from ptp4l, I can hack an i210 NIC to have its 25 MHz clock PLL'ed to a precise frequency reference and its PPS bolted to a precise PPS reference. I haven't yet tried running Chrony on top of that though :-)
Another interesting observation: standard PTP with the G.8275.2 Telecom Profile (i.e. not White Rabbit) can achieve low double-digit nanoseconds of "wander" over a modern MPLS VPN that's not congested and gives priority to PTP traffic. That, with the MPLS switches unaware of PTP (no on-path support) and over ~1000 km of distance... This was measured as MTIE between PPS signals, i.e. the ultimate net result, and the oscillator in the slave was a high-end double-oven OCXO. Yes, this practically means ideal conditions, and the real world tends to be a cruel place; those figures are not something you should take for granted, they merely demonstrate the potential. Also note that the immediate protocol jitter reported by the PTP slave at runtime is much worse than the net deviation measured at the oscillator's 1PPS output. And, over those distances / hop counts, you will get a hefty constant offset (path asymmetry) that needs to be calibrated out / subtracted for the extracted PPS signal to be of any use. And the asymmetry will change upon topology changes (backup routes kicking into action)...