Linux disk usage analyser that acts like symlinks are real files
Solution 1:
GNU du has the --dereference option (short form -L), which follows symbolic links when computing disk usage. However, within a single invocation du counts each file only once, even when several symlinks point to it, which may be a deal-breaker in your situation:
% mkdir foo bar baz
% dd if=/dev/zero of=foo/test bs=1024 count=10000
10000+0 records in
10000+0 records out
10240000 bytes (10 MB) copied, 0.0176239 s, 581 MB/s
% (cd bar; ln -s ../foo/test)
% (cd baz; ln -s ../foo/test)
% du -hc bar baz
4.0K bar
4.0K baz
8.0K total
% du -hc --dereference bar baz
9.8M bar
4.0K baz
9.8M total
If you don't have multiple symlinks to the same target, though, I think --dereference does what you want.
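If you do have several symlinks to the same target and want each one counted, one possible workaround is to run du once per directory, since du only remembers the files it has already counted within a single invocation. A minimal sketch, assuming GNU du and awk; du_each is a hypothetical helper name, not part of coreutils:

```shell
# Hypothetical helper: run du separately for each directory so that a target
# shared by symlinks in several directories is counted once per directory.
du_each() {
    for dir in "$@"; do
        # -s: one summary line per argument
        # -k: report sizes in 1 KiB blocks so they are easy to add up
        # -L: short form of --dereference
        du -skL "$dir"
    done | awk '{ total += $1; print } END { print total "\ttotal" }'
}
```

With the directories from the example above, du_each bar baz should report the size of foo/test under both bar and baz, and the total line counts it twice.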
Solution 2:
Nowadays, git-annex has its own solution for this problem. You can use:
git annex info --fast *
...to get the actual disk usage (and more) of the files directly from git-annex. It can also operate on remote repositories, which is very useful:
git annex info --fast --not --in here .
...would give you, for example, the amount of data that is not in the current repository.
I have also used ncdu with this small patch with good results.
The upstream forum thread discussing this is "du" equivalent on an annex?, which has more suggestions, such as du -L, gadu and sizes, that were mentioned in other answers here.
Solution 3:
git-annex has a list of related software that includes some git-annex-aware disk usage tools: gadu and sizes.