How can I check the actual size used in an NTFS directory with many hardlinks?

Solution 1:

Try Sysinternals Disk Usage (otherwise known as du). The -u flag counts only unique occurrences, and the -v flag shows the usage of each folder as it goes along.

As far as I know, the file system doesn't distinguish between the original file and a hard link (that is really the point of a hard link), so you can't discount them on a folder-by-folder basis; you need to do this comparatively.
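To illustrate what "counting unique occurrences" means, here is a minimal Python sketch (not how du is implemented, just the same idea). It assumes that every name of a hard-linked file reports the same (st_dev, st_ino) pair, which Python exposes on NTFS as the volume serial number and file index. The function name unique_size is made up for this example.

```python
import os
import stat


def unique_size(root):
    """Sum the sizes of regular files under root, counting each file once
    no matter how many hard-linked names point at it."""
    seen = set()   # (device, file id) pairs already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and other non-regular entries
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another name for a file we already counted
            seen.add(key)
            total += st.st_size
    return total
```

Calling unique_size(r"C:\some\folder") (a placeholder path) returns the logical byte count, not the allocated "size on disk".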

To test, I created a folder with 6 files in it and cloned the whole thing. Then I created several hard and soft links inside the first folder that reference other files in the first folder, and also some in the second.

Running du -u -v testFld results in (note the values next to the folders are in KiB):

       104  <path>\testFld\A
        54  <path>\testFld\B
       149  <path>\testFld

Totals:
Files:        12
Directories:  2
Size:         162,794 bytes
Size on disk: 162,794 bytes

Running du -u -v testFld\a results in:

104  <path>\testFld\a
...

Running du -u -v testFld\b results in:

74   <path>\testFld\b
...

Notice the mismatch?
The hard links in A that refer to files in B are counted only against A during the "full" run, so B reports only 54 (even though the files were originally in B and hard-linked from A). When you measure B separately (or if you don't use the -u unique flag), it reports its "full" size of 74.
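The effect can be reproduced with a small Python sketch. This is an illustration under assumptions, not du's actual logic: it assumes folders are visited in the order given and that a shared file is charged to the first folder in which it is seen. The function name unique_sizes is hypothetical.

```python
import os
import stat


def unique_sizes(folders):
    """Return one byte total per folder, sharing a single 'seen' set so a
    hard-linked file is charged only to the first folder that contains it."""
    seen = set()
    totals = []
    for folder in folders:
        subtotal = 0
        for dirpath, _dirnames, filenames in os.walk(folder):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                if not stat.S_ISREG(st.st_mode):
                    continue  # ignore symlinks and other non-regular entries
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    subtotal += st.st_size
        totals.append(subtotal)
    return totals
```

Measuring A and B together in one call charges the shared files to A, while measuring B on its own in a separate call charges them to B, which is the same kind of mismatch as 54 versus 74 above.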

Solution 2:

PowerShell 5 may be an option. It is available for Windows 7, but I only tested this on Server 2012 R2 with the April 2015 Preview.

The filesystem provider in PowerShell 5 has two new properties, LinkType and Target:

ls taskmgr.exe | fl LinkType,Target

This returns:

LinkType : HardLink
Target   : C:\Windows\WinSxS\amd64_microsoft-windows-advancedtaskmanager_..._6.3.9600.17..2\Taskmgr.exe

So now I can show only the files in System32 that are not hardlinks:

cd $env:SystemRoot\System32
ls -Recurse -File -force -ErrorAction SilentlyContinue | ? LinkType -ne HardLink | Measure-Object -Property Length -Sum

This returns:

Count    : 844
Sum      : 502,486,831

You can compare that with all files:

ls -Recurse -File -force -ErrorAction SilentlyContinue | Measure-Object -Property Length -Sum

Count    : 14092
Sum      : 2,538,256,262

So over 13,000 files (14,092 - 844 = 13,248), totaling more than 2 GB (2,538,256,262 - 502,486,831 = 2,035,769,431 bytes), are hardlinks.
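If PowerShell 5 isn't available, roughly the same split can be sketched in Python. This assumes that a link count above 1 (st_nlink) corresponds to what PowerShell reports as LinkType HardLink; I haven't verified that the two always agree. The function name split_by_link_count is made up for this example.

```python
import os
import stat


def split_by_link_count(root):
    """Tally regular files under root by whether they have more than one
    hard link. Every name of a multi-linked file lands in 'multi'."""
    tally = {
        "single": {"count": 0, "bytes": 0},
        "multi": {"count": 0, "bytes": 0},
    }
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable entry, skipped silently
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks and other non-regular entries
            bucket = "multi" if st.st_nlink > 1 else "single"
            tally[bucket]["count"] += 1
            tally[bucket]["bytes"] += st.st_size
    return tally
```

Note that the "single" total leaves out every file that has more than one name, including its first name, so it is a lower bound on the unique data in the folder and not the deduplicated size.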