Rsync friendly gzip
I can't be the only one with this problem: I'm rsyncing .tar.gz files and notice that the full file is transferred every time rather than just the differences. Reading up on it, it seems that back in 1999 someone created an algorithm that fixed the issue, http://svana.org/kleptog/rgzip.html (only 5% of the data needed to be transferred).
Has this gone anywhere since? How do I create rsync-friendly .tar.gz files?
Solution 1:
My gzip (on Ubuntu and Fedora) has the --rsyncable option, so create the tarballs using:
tar -cf - whatever/ | gzip --rsyncable > file.tar.gz
Solution 2:
BeezNest has a pretty good explanation of the --rsyncable option for gzip. In the author's test, this option added about 1% to the file size, but it let rsync transfer updates to a gzipped file more than 1,300 times faster.
For the gory details, see this discussion (specifically, section 4.4.2), which they cite. The gist of it is:
The modification is quite simple:
- A fast rolling signature is computed for a small window around the current point in the uncompressed file;
- stream compression progresses as usual;
- when the rolling signature equals a pre-determined value the compression tables are reset and a token is emitted indicating the start of a new compression region.
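The three steps quoted above can be sketched in Python using the standard zlib module. This is only an illustration of the idea, not gzip's actual implementation: the 4096-byte window, the "sum is a multiple of 4096" trigger, and the names rsyncable_compress and shared_tail are assumptions made for the example, and zlib's Z_FULL_FLUSH stands in for "reset the compression tables".

```python
import zlib

WINDOW = 4096    # assumed: number of input bytes covered by the rolling sum
TRIGGER = 4096   # assumed: a region ends when rolling sum % TRIGGER == 0


def rsyncable_compress(data: bytes, window: int = WINDOW, trigger: int = TRIGGER) -> bytes:
    """Gzip-compress data, resetting compressor state at content-defined points."""
    # wbits=31 asks zlib for a gzip container (header and trailer).
    comp = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=31)
    out = []
    rolling = 0   # sum of the last `window` input bytes
    start = 0     # first input byte not yet handed to the compressor
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]   # drop the byte that left the window
        if i + 1 >= window and rolling % trigger == 0:
            out.append(comp.compress(data[start:i + 1]))
            # Z_FULL_FLUSH byte-aligns the output and discards earlier history,
            # so what follows no longer depends on what came before.
            out.append(comp.flush(zlib.Z_FULL_FLUSH))
            start = i + 1
    out.append(comp.compress(data[start:]))
    out.append(comp.flush())
    return b"".join(out)


def shared_tail(a: bytes, b: bytes) -> int:
    """Count how many trailing bytes two byte strings have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n
```

The result is still an ordinary gzip stream, readable with gunzip or zlib.decompress(blob, 31). If two inputs differ only near the beginning, their compressed forms should agree again after the first reset point that follows the change, apart from the final 8-byte gzip trailer (which holds a checksum of the whole input); comparing them with shared_tail after dropping those 8 bytes is one way to see this. That long run of identical bytes is what lets rsync skip most of the file.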