How does rsync know which files to transfer?
This is just an "I feel curious" question. When syncing locally (that is, from one drive to another on the same host), how does rsync know which files are not worth transmitting? I guess it does not do a full file compare (since that would be too expensive). Does it only do simple checks, like:
- file size
- modification time
If so, I guess it would be easy to fool rsync by changing the contents of the file, keeping the size, and resetting the modification time (if at all possible).
It is a popular misconception that rsync always does a delta transfer. rsync was designed to do the fastest copy, not necessarily a delta copy, and a delta copy requires checksums. Also, rsync does NOT decide which files to transmit but which files can be skipped: by default it takes each file's size and mtime on the sender and compares them with those on the receiving end.
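The size/mtime comparison described above can be sketched in Python. This is only an illustration of the idea, not rsync's code: the function names are made up, and comparing whole seconds is an assumption of the sketch. It also shows the case from the question: a file with different contents but the same size and a reset mtime passes the check.

```python
import os
import tempfile


def quick_check_says_skip(src: str, dst: str) -> bool:
    """Return True if dst exists and matches src in size and mtime."""
    if not os.path.exists(dst):
        return False
    s, d = os.stat(src), os.stat(dst)
    # Assumption of this sketch: compare modification times in whole seconds.
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)


def demo_fooling_quick_check() -> bool:
    """Build two files with different contents that still pass the check."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.txt")
        dst = os.path.join(tmp, "dst.txt")
        with open(src, "wb") as f:
            f.write(b"hello world")
        with open(dst, "wb") as f:
            f.write(b"HELLO WORLD")  # same length, different bytes
        st = os.stat(src)
        # Reset dst's timestamps to match src, as the question suggests.
        os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
        return quick_check_says_skip(src, dst)
```

So yes, a check based only on size and mtime can be fooled this way, which is the trade-off for not reading file contents.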
With rsync you have the option of checksumming each file (--checksum, or -c). Of course, this will be slow, since every file has to be read in full on both sides.
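A whole-file checksum comparison could look roughly like the sketch below. The function names are hypothetical, and SHA-256 is only a stand-in: the hash rsync actually uses depends on its version and options. It shows why this mode is slow: both files must be read completely before a decision can be made.

```python
import hashlib
import os


def file_digest(path: str, chunk_size: int = 1 << 16) -> str:
    """Hash a file's full contents, reading it in chunks."""
    h = hashlib.sha256()  # stand-in hash, not necessarily what rsync uses
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def checksum_says_skip(src: str, dst: str) -> bool:
    """Return True only if dst exists, has src's size, and the same digest."""
    if not os.path.exists(dst):
        return False
    if os.path.getsize(src) != os.path.getsize(dst):
        return False  # a size mismatch already proves the files differ
    return file_digest(src) == file_digest(dst)
```

Unlike the size/mtime check, this catches a file whose contents changed while size and timestamp stayed the same.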
Rsync doesn't do 'incremental'; it's more like 'differential'. It doesn't transfer changes (which would assume some knowledge of a prior run); it transfers differences (found by comparing the source with the target files):
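- First it checks the file size and modification time (the default "quick check"). If both are identical, it skips the file.
- If there's no file with that name on the target, it simply copies the whole file.
- If there's a file on the target, the receiver calculates checksums for each block of its copy (the block size is normally chosen based on the file size and can be forced with --block-size) and sends them to the sender, which then transmits only the parts that differ. When both paths are local, rsync defaults to --whole-file and skips this step.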
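The block-checksum step in the last bullet can be sketched as follows. This is a deliberately simplified illustration, not rsync's implementation: it hashes every window with SHA-256 and leaves out the rolling weak checksum that rsync uses to make the scan cheap. The function names and the tiny block size are made up for the example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, illustrative only


def block_signatures(old: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Receiver side: map each block's digest to its index in the old file."""
    sigs = {}
    for i in range(0, len(old), block_size):
        digest = hashlib.sha256(old[i:i + block_size]).digest()
        sigs.setdefault(digest, i // block_size)
    return sigs


def make_delta(new: bytes, sigs: dict, block_size: int = BLOCK_SIZE) -> list:
    """Sender side: emit ("copy", block_index) or ("data", bytes) steps."""
    delta, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + block_size]
        idx = sigs.get(hashlib.sha256(window).digest())
        if idx is not None:
            if literal:  # flush pending literal bytes first
                delta.append(("data", bytes(literal)))
                literal = bytearray()
            delta.append(("copy", idx))  # receiver already has this block
            pos += len(window)
        else:
            literal.append(new[pos])  # no match here: send the byte itself
            pos += 1
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta: list, block_size: int = BLOCK_SIZE) -> bytes:
    """Receiver side: rebuild the new file from old blocks and literal data."""
    out = bytearray()
    for op, arg in delta:
        if op == "copy":
            out += old[arg * block_size:(arg + 1) * block_size]
        else:
            out += arg
    return bytes(out)
```

For example, with old = b"aaaabbbbccccdddd" and new = b"aaaaXbbbbccccdddd", only the single inserted byte travels as literal data; everything else is expressed as references to blocks the receiver already holds.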
For more info, have a look at the rsync man page:
https://linux.die.net/man/1/rsync