Rsync -avzHP follows hardlinks instead of copying them as hardlinks

I use rsnapshot to create hourly/daily/weekly/monthly backups of my "work" share. Now I'm trying to copy the whole backup directory onto an external drive using rsync.

I used this command and these parameters within a screen session (yes, rsync-exclude.txt is in the directory I run the command from):

rsync -avzHP --exclude-from 'rsync-exclude.txt' /share/backup/ /share/eSATADisk1/backup/;

The whole thing is running on a QNAP TS-439; the internal drive is a single disk (no RAID) formatted EXT4, and the external drive is formatted EXT3.

What happens is: rsync follows every hardlink and copies the actual file instead of recreating the corresponding hardlink on the external drive. I didn't notice this right away, so the external drive ended up trashed with xxx copies of the same files.

What I want to achieve is: copying the whole file structure generated by rsnapshot to the external drive while keeping the hardlinks to save space. Note: this doesn't necessarily have to be done using rsync.

Thanks for your ideas and time. I'd appreciate your help, big time.

Update: I learned that rsnapshot isn't using symlinks, it's using hardlinks, so I now use the -H option, which should preserve the hardlink structure according to Rsnapshot to multiple destinations (or maintain hard links structure), but it still won't work... what am I missing here?

Update 2: I found another opinion/statement on this topic here: rsync with --hard-links freezes. Steven Monday suggests not trying to rsync big file structures containing hardlinks, since it soaks up a lot of memory and is a hard task for rsync. So probably a better solution would be making an .img of the data structure I'm trying to back up. What do you think?
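
Something along these lines is what I have in mind for the image route (the device name /dev/sda3 is just a placeholder for whatever partition actually holds /share/backup; I'd confirm it first, since dd reads whatever device it is given):

mount                                                     # find which device is mounted under /share
dd if=/dev/sda3 of=/share/eSATADisk1/backup.img bs=4M     # raw image of that partition (device name assumed)

As far as I understand, dd copies the raw partition, so the resulting .img would be as large as the whole partition regardless of how much of it is actually used, and the destination filesystem has to be able to hold a file that big.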


Solution 1:

The rsync command's -H (or --hard-links) option will, in theory, do what you are trying to accomplish, which is, in brief: to create a copy of your filesystem that preserves the hard linked structure of the original. As I mentioned in my answer to another similar question, this option is doomed to fail once your source filesystem grows beyond a certain threshold of hard link complexity.

The precise location of that threshold may depend on your RAM and the total number of hard links (and probably a number of other things), but I have found that there's no point in trying to define it precisely. What really matters is that the threshold is all-too-easy to cross in real-world situations, and you won't know that you have crossed it, until the day comes that you try to run an rsync -aH or a cp -a that struggles and eventually fails.

What I recommend is this: Copy your heavily hard linked filesystem as one unit, not as files. That is, copy the entire filesystem partition as one big blob. There are a number of tools available to do this, but the most ubiquitous is dd.

With stock firmware, your QNAP NAS should have dd built in, as well as fdisk. With fdisk, create a partition on the destination drive that is at least as large as the source partition. Then, use dd to create an exact copy of your source partition on the newly created destination partition.
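
A rough sketch, with placeholder device names (confirm yours with fdisk -l before running anything, because dd overwrites its target without asking):

fdisk -l                             # identify the source and the newly created destination partition
dd if=/dev/sda3 of=/dev/sdb1 bs=4M   # block-for-block copy of the source partition (names are examples only)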

While the dd copy is in progress, you must ensure that nothing changes in the source filesystem, lest you end up with a corrupted copy on the destination. One way to do that is to umount the source before starting the copying process; another way is to mount the source in read-only mode.
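
For example, before starting the dd above (assuming /share/backup is the mount point of the source filesystem; on a QNAP it may actually live under a larger data volume):

umount /share/backup                  # option 1: take the source offline entirely
mount -o remount,ro /share/backup     # option 2: remount the source read-only instead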

Solution 2:

-l is for symlinks; why would it do anything for hardlinks?

(Sorry this is an answer and not a comment; I don't have comment rights yet, and this needed a response.)

Another note that should be a comment: is this all native hardware, or are you on a VM or a network mount?

Edit

Ignore my earlier comment regarding why you are using hardlinks; I missed the rsnapshot comment.

It would be helpful to run a test that first tries rsync between two local directories on a local disk, then against your remote disk. The little test below shows that the -H option works as expected. The -i option for ls shows the inodes, confirming that the links have been preserved, with no extra copies.
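
For reference, a test directory along these lines reproduces the listing below (the file names match the output; the file content itself is arbitrary):

$ mkdir src
$ echo testdata > src/file111_prime.txt
$ ln src/file111_prime.txt src/file111.txt   # hard link: both names point at the same inode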

$ rsync -avzHP src/ dest
sending incremental file list
created directory dest
./
file111_prime.txt
           9 100%    0.00kB/s    0:00:00 (xfer#1, to-check=0/3)
file111.txt => file111_prime.txt

sent 156 bytes  received 59 bytes  430.00 bytes/sec
total size is 18  speedup is 0.08

$ ls -liR
.:
total 8
414044 drwxrwxr-x. 2 nhed nhed 4096 Feb 25 09:58 dest
414031 drwxrwxr-x. 2 nhed nhed 4096 Feb 25 09:58 src

./dest:
total 8
414046 -rw-rw-r--. 2 nhed nhed 9 Feb 25 09:57 file111_prime.txt
414046 -rw-rw-r--. 2 nhed nhed 9 Feb 25 09:57 file111.txt

./src:
total 8
414032 -rw-rw-r--. 2 nhed nhed 9 Feb 25 09:57 file111_prime.txt
414032 -rw-rw-r--. 2 nhed nhed 9 Feb 25 09:57 file111.txt

A subsequent test, rsync -avzHP src/ host:/tmp, to a remote host still maintained the hardlinks.