On x86-64, is the "movnti" or "movntdq" instruction atomic across a system crash?

When using persistent memory such as Intel Optane DCPMM, is it possible to see a partial result after reboot if the system crashes (e.g. a power outage) while a movnt instruction is executing?

For:

  • 4- or 8-byte movnti, which x86 guarantees to be atomic for other purposes
  • 16-byte SSE movntdq / movntps which aren't guaranteed atomic but which in practice probably are on CPUs supporting persistent memory.
  • 32-byte AVX vmovntdq / vmovntps
  • 64-byte AVX512 vmovntdq / vmovntps full-line stores
  • bonus question: MOVDIR64B which has guaranteed 64-byte write atomicity, on future CPUs that support it and DC-PM. e.g. Sapphire Rapids Xeon / Tiger Lake / Tremont.

movntpd is assumed to be identical to movntps.


Related questions:

  • On x86-64, is the "movnti" instruction atomic? yes
  • Is clflush or clflushopt atomic when system crash?

Solution 1:

The following operations are guaranteed to be persistently atomic:

  • A store uop that doesn't cross an 8-byte boundary to a location of any effective memory type, and
  • MOVDIR64B.
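As a minimal sketch of the first case, the helper below issues a single 8-byte MOVNTI through the `_mm_stream_si64` intrinsic and follows it with `sfence`. The name `pm_store_u64` is hypothetical, and the sketch assumes `dst` points into a persistent-memory mapping on an x86-64 build; on other targets it falls back to a plain store with no persistence meaning.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>  /* _mm_stream_si64 (MOVNTI) and _mm_sfence */
#endif

/* Hypothetical helper: write one 8-byte value with a non-temporal store.
 * The alignment check ensures the store cannot cross an 8-byte boundary,
 * which is the condition the guarantee above depends on. */
static inline void pm_store_u64(unsigned long long *dst, unsigned long long value)
{
    assert(((uintptr_t)dst & 7u) == 0);
#if defined(__x86_64__) || defined(_M_X64)
    _mm_stream_si64((long long *)dst, (long long)value);
    _mm_sfence();  /* order the NT store before any later stores */
#else
    *dst = value;  /* portability fallback only; no persistence semantics */
#endif
}
```

After a crash, a reader of `*dst` would then be expected to find either the old 8-byte value or the new one, never a mix of the two.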

Note that all atomicity guarantees listed in the Intel SDM Volume 3, Section 8.1.1 also apply to persistent memory.

In addition, the following operations are persistently atomic:

  • A cache line flush (CLFLUSH or CLFLUSHOPT),
  • A cache line writeback (CLWB),
  • A non-architectural cache line eviction, and
  • A full write-combining buffer (WCB) flush on Intel processors. The presence and size of WCBs and the causes of a flush are implementation-specific. See: (Persistence) ordering of Intel non-temporal stores to the same cache line.

There is no architectural persistent atomicity guarantee for everything else, including 64-byte AVX512 vmovntdq / vmovntps full-line stores.
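Since a 64-byte store may therefore be torn, one common way to cope is to publish a record with a separate 8-byte commit word, relying only on the 8-byte guarantee. The sketch below is an illustration under stated assumptions, not something from the correspondence: `struct pm_record`, `pm_record_write`, and `pm_record_is_valid` are hypothetical names, the record is assumed to start with `commit == 0` already persistent, and NT stores followed by `sfence` are assumed to have reached the persistence domain on an ADR or eADR platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>
#define PM_X86_64 1
#else
#define PM_X86_64 0
#endif

#define PM_PAYLOAD_WORDS 7

/* Hypothetical 64-byte record: 56 bytes of payload plus an 8-byte commit word. */
struct pm_record {
    unsigned long long payload[PM_PAYLOAD_WORDS];
    unsigned long long commit;  /* 0 = not valid, nonzero = payload complete */
};

static void pm_nt_store(unsigned long long *dst, unsigned long long v)
{
#if PM_X86_64
    _mm_stream_si64((long long *)dst, (long long)v);  /* MOVNTI */
#else
    *dst = v;  /* fallback for non-x86 builds; no persistence semantics */
#endif
}

static void pm_fence(void)
{
#if PM_X86_64
    _mm_sfence();
#endif
}

/* Write the payload, fence, then publish with one 8-byte store.
 * Assumes r->commit was 0 (and persistent) before the call. */
static void pm_record_write(struct pm_record *r,
                            const unsigned long long src[PM_PAYLOAD_WORDS],
                            unsigned long long seq)
{
    assert(((uintptr_t)r & 7u) == 0 && seq != 0);
    for (int i = 0; i < PM_PAYLOAD_WORDS; i++)
        pm_nt_store(&r->payload[i], src[i]);
    pm_fence();                    /* payload is ordered before the commit word */
    pm_nt_store(&r->commit, seq);  /* single 8-byte store that does not cross a boundary */
    pm_fence();
}

/* Recovery-side check: trust the payload only if the commit word is set. */
static int pm_record_is_valid(const struct pm_record *r)
{
    return r->commit != 0;
}
```

With this layout, recovery code would ignore any record whose commit word is still zero, so a torn payload is never interpreted as valid data. Overwriting an already valid record would need an extra step (for example clearing and fencing the commit word first), which the sketch leaves out.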

These guarantees apply to Asynchronous DRAM Refresh (ADR) platforms and Enhanced Asynchronous DRAM Refresh (eADR) platforms. (On eADR, the cache hierarchy is in the persistence domain. See: Build Persistent Memory Applications with Reliability Availability and Serviceability.)

This answer is based on my private correspondence with Andy Rudoff (Intel).