Why do two directory hierarchies that are in sync have different sizes?

I'm using rsync to sync two folders:

rsync -arzv --times --delete-after --relative -e ssh user@host:path/./media/ ~/path/

and it says everything is good, but the destination reports:

$ du -s path/media/
18335196    site_media/media/

and the source reports:

$ du -s path/media/
18473500        site_media/media/

When I dig into the problem, all the files are the same size, but the directories differ in size. Why? Both are VMs running Ubuntu; the source is on 11.04 and the destination is on 12.04 LTS.

I understand why they don't add up to the same numbers; what I'd like to understand is why the folders report different sizes.


Since these are two different VMs running different Ubuntu releases, I'd suspect the filesystem block size is the culprit. du reports how much disk space is allocated, not the sum of the file sizes. A subtle, yet important distinction.

If you have a file that is 1 byte in size and your block size is 1KB, then du will report 1KB as used. If the block size is 4KB, it will report 4KB used. A 1025-byte file is reported as 2KB with a 1KB block size and 4KB with a 4KB block size. A 4097-byte file is reported as 5KB with a 1KB block size and 8KB with a 4KB block size.
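The rounding described above can be reproduced with plain shell arithmetic. This is only a sketch: allocated_bytes is a hypothetical helper name, and it ignores sparse files, inline data and filesystem metadata, so it approximates what du would report rather than predicting it exactly.

```shell
# allocated_bytes SIZE BLOCK
# Round SIZE (in bytes) up to a whole number of BLOCK-sized blocks.
# Hypothetical helper for illustration only; real filesystems can
# deviate (sparse files, inline data, tail packing).
allocated_bytes() {
    size=$1
    block=$2
    echo $(( (size + block - 1) / block * block ))
}

allocated_bytes 1    1024   # 1024 (1KB)
allocated_bytes 1025 1024   # 2048 (2KB)
allocated_bytes 1025 4096   # 4096 (4KB)
allocated_bytes 4097 1024   # 5120 (5KB)
allocated_bytes 4097 4096   # 8192 (8KB)
```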

This sequence demonstrates this behavior:

$ touch foo ; du -h foo
  0B    foo
$ echo -n 1 > foo ; du -h foo
4.0K    foo

Use this command to show the block size of your filesystems:

tune2fs -l /dev/sda1 | grep -i 'block size'

(Obviously, replace /dev/sda1 with the appropriate block device. Note that tune2fs only understands ext2/ext3/ext4 filesystems and usually needs root.)

If the two machines report different values, there's your discrepancy.
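If tune2fs is not an option, the block size can also be queried through a path on the mounted filesystem. A minimal sketch, assuming GNU coreutils stat (the -f and -c options and the %S format are GNU-specific); fs_block_size is a hypothetical helper name:

```shell
# fs_block_size PATH
# Print the fundamental block size, in bytes, of the filesystem
# that contains PATH. Assumes GNU coreutils stat; no root needed.
# Hypothetical helper name, for illustration only.
fs_block_size() {
    stat -f -c '%S' "$1"
}

# Example: query the filesystem holding the current directory.
fs_block_size .
```

Run it against the synced directory on both machines and compare the two numbers.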

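A better way to verify that the rsync copy is exact is to hash the files on both sides and compare. Here's an example:

find path/media -type f -exec openssl sha1 {} + | sort > ~/hashes

(The -type f keeps find from handing directories to openssl, which cannot hash them.)

Run it on both machines, then diff the two hashes files.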
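The hash-and-diff idea can be wrapped in a small function so that both sides produce listings in exactly the same form. This is a sketch under assumptions: hash_tree is a hypothetical helper name, and it uses sha1sum from GNU coreutils rather than openssl so that each line has the form "hash  ./relative/path".

```shell
# hash_tree DIR
# Print one "hash  ./relative/path" line per regular file under DIR,
# sorted by path in the C locale so two listings diff cleanly.
# Hypothetical helper; assumes sha1sum (GNU coreutils) is installed.
hash_tree() {
    ( cd "$1" && find . -type f -exec sha1sum {} + | LC_ALL=C sort -k 2 )
}

# Usage sketch (paths are the placeholders from the question):
#   on the source:       hash_tree path/media > ~/hashes.src
#   on the destination:  hash_tree path/media > ~/hashes.dst
#   copy one listing across, then:
#   diff ~/hashes.src ~/hashes.dst
```

Hashing relative paths from inside the directory means the listings still match when the two trees live under different parent directories.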


There are many sources of differences when using du; check the man page for reference. I have faced this problem on AIX too. The manual's description of the --apparent-size option explains these differences quite well. Also mind the block size du uses for its calculation (the default is 1024 bytes, but it may vary depending on the system). To compare reliably, use a command that shows the exact size of each file (ls or find), which is how I solved it.
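One way to compare exact file sizes with find is to sum the byte length of every regular file and ignore block allocation entirely. A minimal sketch, assuming GNU find (the -printf action is a GNU extension); total_file_bytes is a hypothetical helper name:

```shell
# total_file_bytes DIR
# Sum the byte lengths of all regular files under DIR, regardless of
# how many blocks each file occupies on disk. Directories themselves
# are not counted.
# Hypothetical helper; -printf is a GNU find extension.
total_file_bytes() {
    find "$1" -type f -printf '%s\n' |
        awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Example: total for the current directory tree.
total_file_bytes .
```

If this total matches on both machines while du -s does not, the difference comes from block allocation or directory entries rather than from file contents.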