Sluggish performance on NTFS drive with large number of files
I am looking at this setup:
- Windows Server 2012
- 1 TB NTFS drive, 4 KB clusters, ~90% full
- ~10M files stored in 10,000 folders = ~1,000 files/folder
- Files mostly quite small < 50 KB
- Virtual drive hosted on disk array
When an application accesses files in random folders, it takes 60-100 ms to read each file. A test tool shows that the delay occurs when opening the file; reading the data afterwards takes only a fraction of that time.
In summary, this means that reading 50 files can easily take 3-4 seconds, which is much more than expected. Writing is done in batch, so performance is not an issue there.
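A minimal sketch of this kind of measurement (Python here only for illustration, not the actual test tool; `sample_paths.txt` stands in for a pre-built list of random file paths so enumerating the tree does not warm the metadata cache first):

```python
# Minimal sketch: time the file open separately from the data read.
import time

with open("sample_paths.txt") as listing:          # hypothetical list of ~50 random files
    paths = [line.strip() for line in listing if line.strip()]

open_total = read_total = 0.0
for path in paths:
    t0 = time.perf_counter()
    fh = open(path, "rb")      # open: name lookup, ACL check, MFT reads
    t1 = time.perf_counter()
    fh.read()                  # read: the actual data transfer (files are < 50 KB)
    fh.close()
    t2 = time.perf_counter()
    open_total += t1 - t0
    read_total += t2 - t1

n = len(paths)
print(f"avg open: {open_total / n * 1000:.1f} ms, avg read: {read_total / n * 1000:.1f} ms")
```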
I already followed advice on SO and SF to arrive at these numbers.
- Using folders to reduce the number of files per folder (Storing a million images in the filesystem); a sketch of a typical scheme is shown after this list
- Running `contig` to defragment folders and files (https://stackoverflow.com/a/291292/1059776)
- 8.3 names and last access time disabled (Configuring NTFS file system for performance)
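For illustration only, a minimal sketch of such a folder scheme; the hash-based two-level layout and the root path are assumptions, not the layout actually used on this server:

```python
# Illustrative only: spread files across subfolders by a hash of the file
# name so no single folder grows too large. Root path and the two-level
# layout (256 x 256 buckets) are assumptions for this sketch.
import hashlib
from pathlib import Path

def shard_path(root: Path, filename: str, levels: int = 2) -> Path:
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]   # e.g. ['4f', '9c']
    return root.joinpath(*parts, filename)

target = shard_path(Path(r"D:\data"), "invoice_0001234.xml")
target.parent.mkdir(parents=True, exist_ok=True)
print(target)   # e.g. D:\data\4f\9c\invoice_0001234.xml
```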
What to do about the read times?
- Consider 60-100 ms per file to be ok (it isn't, is it?)
- Any ideas how the setup can be improved?
- Are there low-level monitoring tools that can tell what exactly the time is spent on?
UPDATE
As mentioned in the comments, the system runs Symantec Endpoint Protection. However, disabling it does not change the read times.
PerfMon measures 10-20 ms per read. This would mean that each file read requires ~6 I/O read operations, right? Would these be MFT lookups and ACL checks?
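Working through the numbers: 60-100 ms per file divided by 10-20 ms per read gives roughly 5-6 read operations per file open.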
The MFT has a size of ~8.5 GB, which is more than main memory.
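That size is plausible: with ~10M files and the usual 1 KB per MFT file record, an MFT in that range is to be expected.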
The server did not have enough memory. Instead of caching NTFS metafile data in memory, every file access required multiple disk reads. As usual, the issue is obvious once you see it. Let me share what clouded my perspective:
The server showed 2 GB of memory available in both Task Manager and RamMap. So either Windows decided that the available memory was not enough to hold a meaningful part of the metafile data, or some internal restriction does not allow the last bit of memory to be used for metafile data.
After upgrading the RAM, Task Manager still did not show more memory in use. However, RamMap reported multiple GB of metafile data held as standby memory. Apparently, standby data can have a substantial impact.
Tools used for the analysis:
- `fsutil fsinfo ntfsinfo driveletter:` to show the NTFS MFT size (or NTFSInfo)
- RamMap to show memory allocation
- Process Monitor to show that every file read is preceded by about 4 read operations to drive:\$Mft and drive:\$Directory. Although I could not find an exact definition of $Directory, it seems to be related to the MFT as well.