Find the bottleneck: disk I/O on Windows XP

One of our development boxes has developed a problem wherein performance will occasionally drop through the floor. When this happens, you can hear the hard drive thrashing, but I don't know what's causing it.

This happens during periods of high disk access (reading/writing multi-gigabyte files), but not every time nor for the entire period of disk access. Those files are also kept rigorously defragmented specifically to prevent the kind of "seek thrashing" that seems to be occurring.

I suspect that the problem lies either with the system's antivirus or with some disk-indexing service I don't know about (AFAIK, there aren't any running, but…). Unfortunately, my Performance Monitor-fu is very, very weak (okay, nearly non-existent), and I don't know how to confirm/disprove my suspicions or find out what the real culprit is.

Update:

Process Explorer located the culprits for me — the Java Quick Starter and Windows Search services. Turning off the former had a noticeable impact on performance and turning off the latter had an enormous one (despite having not been given any files to access). Both were performing 5-20 times as much disk access as any other process.

Thanks all for your help!


Download Process Explorer (sysinternals/microsoft).

Under View, Select Columns, Process Performance, choose e.g. IO Write Bytes, IO Read Bytes.

You can click on those columns to sort.
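If you would rather log the numbers than watch them, the same "sort by I/O bytes" idea can be sketched in a few lines of Python. This is only an illustration: the snapshot format and the process names are made up, and how you collect the counters is up to you (psutil's `Process.io_counters()` is one possible source, which I am assuming rather than showing here).

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot format: {process_name: (read_bytes, write_bytes)}.
# The counters are cumulative, as in Process Explorer's IO columns.
Snapshot = Dict[str, Tuple[int, int]]


def rank_by_io(before: Snapshot, after: Snapshot) -> List[Tuple[str, int]]:
    """Return (name, bytes transferred between the snapshots), busiest first."""
    deltas = []
    for name, (read_after, write_after) in after.items():
        # A process that started after the first snapshot counts from zero.
        read_before, write_before = before.get(name, (0, 0))
        total = (read_after - read_before) + (write_after - write_before)
        deltas.append((name, total))
    return sorted(deltas, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    before = {"indexer.exe": (1_000_000, 200_000), "editor.exe": (50_000, 10_000)}
    after = {"indexer.exe": (90_000_000, 4_000_000), "editor.exe": (60_000, 12_000)}
    for name, total in rank_by_io(before, after):
        print(f"{name:20} {total:>14,} bytes")
```

Taking the difference between two snapshots matters because the raw counters are totals since process start, so a long-running but currently idle process can otherwise look like the culprit.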


Sysinternals has a couple of tools that might help here. Their Process Monitor tool will give you (exhaustive) details about what's accessing what. If it really is an AV tool doing a scan, it should show up there very obviously. If it is a background Windows task, things get a bit more complicated, but those accesses show up there as well, even if it is less obvious which service is responsible.


I have used Sysinternals FileMon successfully to find out which program is doing a lot of I/O and with which files. For example, when I switched to an SSD with poor random write performance (4 IOPS, OCZ Core v1), FileMon told me which programs were doing the writing, and I could move those files from the SSD to another HDD. It also helped me find out that updating last access timestamps was killing performance (when Locate32 indexes all my disks once a day), so I was able to disable last access timestamps.


To verify a disk I/O issue, you will want to start by monitoring the Logical Disk/Current Disk Queue Length on each drive in Performance Monitor. This should generally stay at zero unless you have multiple processes accessing the disk. Excessive I/O or poor disk performance will increase the queue length.
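If you want a record rather than a live graph, the same counter can be logged to CSV from the command line, for example with something like `typeperf "\LogicalDisk(*)\Current Disk Queue Length" -si 1 -sc 60 -o queue.csv`. The sketch below shows one way to summarise such a log in Python. Treat it as a rough illustration: the machine name and sample values are invented, and the two cut-offs (`threshold` and `min_fraction`) are arbitrary assumptions, not recommended values.

```python
import csv
import io
from typing import Dict


def sustained_queues(csv_text: str, threshold: float = 0.0,
                     min_fraction: float = 0.5) -> Dict[str, float]:
    """Map counter name -> fraction of samples above `threshold`,
    keeping only counters where that fraction is at least `min_fraction`."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, samples = rows[0], rows[1:]
    flagged = {}
    for col in range(1, len(header)):  # column 0 holds the timestamp
        values = []
        for row in samples:
            try:
                values.append(float(row[col]))
            except (IndexError, ValueError):
                continue  # skip blank samples and non-data lines
        if not values:
            continue
        fraction = sum(v > threshold for v in values) / len(values)
        if fraction >= min_fraction:
            flagged[header[col]] = fraction
    return flagged


if __name__ == "__main__":
    # Invented log in the general shape of a typeperf CSV file.
    sample = r'''"(PDH-CSV 4.0)","\\DEVBOX\LogicalDisk(C:)\Current Disk Queue Length","\\DEVBOX\LogicalDisk(D:)\Current Disk Queue Length"
"03/01/2009 12:00:00.000","0.000000","4.000000"
"03/01/2009 12:00:01.000","0.000000","6.000000"
"03/01/2009 12:00:02.000","1.000000","5.000000"
'''
    for counter, fraction in sustained_queues(sample).items():
        print(f"{counter}: above threshold in {fraction:.0%} of samples")
```

Looking at the fraction of samples rather than the peak keeps a single burst from being flagged, which fits the point that the queue should sit near zero most of the time.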

Once you verify an I/O issue, use something like FileMon or Process Monitor from Microsoft (Sysinternals) to see which process is causing the activity.

If you do not see a process corresponding to the I/O, then it may be a page file/memory issue. Go back to Performance Monitor and add the Memory\Pages/sec counter. This shows how frequently the system has to go to disk to page memory into RAM. If it holds above zero, then you need more memory or an application has a memory leak. To find the leak, use the Process\Page Faults/sec counter to see which process is forcing the paging.