rsync with --hard-links freezes
Solution 1:
My answer, which I give from hard-earned experience, is: Don't do this. Don't try to copy a directory hierarchy that makes heavy use of hard links, such as one created using rsnapshot or rsync --link-dest or similar. It won't work on anything but small datasets. At least, not reliably. (Your mileage may vary, of course; perhaps your backup datasets are much smaller than mine were.)
The problem with using rsync --hard-links to recreate the hard-linked structure of files on the destination side is that discovering the hard links on the source side is hard. rsync has to build a map of inodes in memory to find the hard links, and unless your source has relatively few files, this can and will blow up. In my case, when I learned of this problem and was looking around for alternative solutions, I tried cp -a, which is also supposed to preserve the hard-link structure of files in the destination. It churned away for a long time and then finally died (with a segfault, or something like that).
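To make the memory cost concrete, here is a minimal Python sketch of the general technique: walk the tree and remember every multiply-linked file by its (device, inode) pair. It is not rsync's or cp's actual implementation, and the function name find_hard_link_groups is made up for illustration.

```python
import os
from collections import defaultdict


def find_hard_link_groups(root):
    """Group paths under root that share an inode (hard links of one file)."""
    seen = defaultdict(list)  # (device, inode) -> paths seen so far
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # A file with a link count of 1 cannot have a twin, so skip it.
            # Anything else has to stay in the table until the walk ends,
            # because its other names may turn up in any later directory.
            if st.st_nlink > 1:
                seen[(st.st_dev, st.st_ino)].append(path)
    # Keep only inodes that were actually seen under more than one name.
    return {key: paths for key, paths in seen.items() if len(paths) > 1}
```

The table grows with the number of multiply-linked files, and in a snapshot tree built from hard links that is likely to be most of them, so nothing can be dropped from memory until the whole walk has finished.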
My recommendation is to set aside an entire partition for your rsnapshot backup. When it fills up, bring another partition online. It is much easier to move around hard-link-heavy datasets as entire partitions, rather than as individual files.
Solution 2:
At the point rsync seems to hang, is it hung or just busy? Check for CPU activity with top and disk activity with iotop -o.
It could be busy copying over a large file. You would see this in iotop or similar, or in rsync's display if you ran it with the --progress option.
It could also be busy scanning through lists of inodes to check for linked files. If incremental recursion is being used (the default for recursive transfers in most cases when both client and server run rsync 3.0.0 or later), it could have just hit a directory with many files and be running the link check between all the files in it and all those found previously. The --hard-links option can be very CPU-intensive over large sets of files (this is why it is not included in the list of options implied by the general --archive option). This will manifest itself as high CPU use at the time rsync seems paused/hung.
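If you would rather script the CPU half of that check than watch top, a rough Python sketch follows. It assumes Linux, where /proc/<pid>/stat exposes the user and system CPU time a process has used. The function names are made up for illustration, and the sketch does not cover disk activity, for which iotop is still the tool to use.

```python
import os
import time


def cpu_ticks_from_stat(stat_text):
    """Return utime + stime (in clock ticks) from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split only what follows the last ')'.
    fields = stat_text[stat_text.rindex(")") + 2:].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_seconds_used(pid, interval=5.0):
    """Sample a process twice; return the CPU seconds it used in between."""
    def read_ticks():
        with open(f"/proc/{pid}/stat") as f:
            return cpu_ticks_from_stat(f.read())

    before = read_ticks()
    time.sleep(interval)
    after = read_ticks()
    # Convert kernel clock ticks to seconds.
    return (after - before) / os.sysconf("SC_CLK_TCK")
```

Calling cpu_seconds_used with the PID of the rsync process that seems stuck gives a quick reading: a result close to the interval suggests it is busy on one core (for instance, running the link check), while a result near zero over several samples suggests it is waiting on something else.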