Copy or rsync command

The following command is working as expected...

cp -ur /home/abc/* /mnt/windowsabc/

Does rsync have any advantage over it? Is there a better way to keep the backup folder in sync every 24 hours?


Solution 1:

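rsync is better because it can transfer only the changed parts of an updated file instead of the whole file. Note that this delta transfer is the default over a network; when both source and destination are local paths, rsync defaults to copying whole files unless you pass --no-whole-file. It can also compress data in transit (-z) and run over an encrypted transport such as SSH if you want. Check out this tutorial.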

Solution 2:

rsync is not necessarily more efficient, due to the more detailed inventory of files and blocks it performs. The algorithm is fantastic at what it does, but you need to understand your problem to know if it is really going to be the best choice.

On a very large file system (say many thousands or millions of files) where files tend to be added but not updated, "cp -u" will likely be more efficient. cp makes the decision to copy solely on metadata and can simply get to the business of copying.
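To illustrate the "solely on metadata" point, here is a minimal sketch, assuming GNU cp and touch. The temp directories and file names are made up for the demo. It shows cp -u skipping a file purely because the destination's timestamp is newer, without ever looking at the contents:

```shell
#!/bin/sh
# Throwaway directories standing in for the real source and backup.
src=$(mktemp -d)
dst=$(mktemp -d)

echo "v1" > "$src/a.txt"
cp -u "$src/a.txt" "$dst/"            # destination missing: file is copied

echo "edited on backup" > "$dst/a.txt"
touch -d '2001-01-01' "$src/a.txt"    # make the source older than the destination
cp -u "$src/a.txt" "$dst/"            # source is older: skipped, contents never compared

cat "$dst/a.txt"                      # still shows the destination's own content
```

The flip side is that cp -u never notices a destination file that differs in content but has a newer timestamp.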

Note that you might want some buffering, e.g. by using tar rather than straight cp, depending on the size of the files, network performance, other disk activity, etc. I find the following idea very useful:

tar cf - . | tar xCf directory -

Metadata itself may actually become a significant overhead on very large (cluster) file systems, but rsync and cp will share this problem.

rsync seems to frequently be the preferred tool (and in general purpose applications is my usual default choice), but there are probably many people who blindly use rsync without thinking it through.

Solution 3:

The command as written will create new directories and files with the current date and time stamp, and with you as the owner. If you are the only user on your system and you are doing this daily, it may not matter much. But if preserving those attributes matters to you, you can modify your command to

cp -pur /home/abc/* /mnt/windowsabc/

The -p will preserve ownership, timestamps, and mode of the file. This can be pretty important depending on what you're backing up.
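If you want to see the difference for yourself, here is a small sketch, assuming GNU coreutils (touch -d and stat -c). The temp directories and file names are only for illustration:

```shell
#!/bin/sh
src=$(mktemp -d)
dst=$(mktemp -d)

echo data > "$src/f"
touch -d '2020-01-01 00:00:00' "$src/f"   # give the source an old modification time

cp    "$src/f" "$dst/plain"       # modification time becomes "now"
cp -p "$src/f" "$dst/preserved"   # modification time is carried over from the source

# Print modification time and name for each file.
stat -c '%y  %n' "$src/f" "$dst/plain" "$dst/preserved"
```

Only the copy made with -p should show the 2020 timestamp.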

The alternative command with rsync would be

rsync -avh /home/abc/* /mnt/windowsabc

With rsync, -a means "archive", which preserves all the attributes mentioned above. -v means "verbose", which lists what rsync is doing with each file as it runs. -z enables compression; it is left out here because this is a local copy, but it will help if you are backing up over a network. Finally, -h tells rsync to report sizes in human-readable units such as MB, GB, etc.

Out of curiosity, I ran one copy first to prime the system, so that the first timed run would not be at a disadvantage. Then I timed the following on a test set of 1GB of files, copying from an internal SSD to a USB-connected HDD. Each run copied into an empty target directory.

cp -pur    : 19.5 seconds
rsync -ah  : 19.6 seconds
rsync -azh : 61.5 seconds

cp and rsync without compression take about the same time, while compressing and decompressing clearly taxes the system when bandwidth is not the bottleneck.

Solution 4:

Especially if you use a copy-on-write filesystem like BTRFS or ZFS, rsync is much better.

I use BTRFS, and I have this in my ~/.bashrc:

alias cp="rsync -ah --inplace --no-whole-file --info=progress2"

The important flag here for CoW filesystems like BTRFS is --inplace: combined with --no-whole-file, it writes only the changed parts of a file into the existing destination file, instead of creating a new copy (with a new inode) for every small change. See this.