Has anyone achieved true differential sync with rsync in ESXi?

Berate me later on the fact that I'm using the service console to do anything in ESXi...

I've got a working rsync binary (v3.0.4) that I can use in ESXi 4.1U1. I tend to use rsync over cp when copying VMs or backups from one local datastore to another. I've also used rsync to copy data from one ESXi box to another, but only for small files.

I'm now trying to do true differential syncs of backups taken via ghettoVCB between my primary ESXi machine and a secondary one. But even when I do this locally (one datastore to another datastore on the same ESXi machine), rsync appears to copy the files in their entirety. I've got two VMDKs totalling 80GB in size, and rsync still takes anywhere between 1 and 2 hours, even though the VMDKs aren't growing that much daily.

Below is the rsync command I'm executing. I am copying locally because ultimately these files will get copied onto a datastore created from a LUN on a remote system. It's not an rsync that'll be serviced by an rsync daemon on a remote system.

rsync -avPSI VMBACKUP_2011-06-10_02-27-56/* VMBACKUP_2011-06-01_06-37-11/ --stats --itemize-changes --existing --modify-window=2 --no-whole-file
sending incremental file list
>f..t...... VM-flat.vmdk
 42949672960 100%   15.06MB/s    0:45:20 (xfer#1, to-check=5/6)
>f..t...... VM.vmdk
         556 100%    4.24kB/s    0:00:00 (xfer#2, to-check=4/6)
>f..t...... VM.vmx
        3327 100%   25.19kB/s    0:00:00 (xfer#3, to-check=3/6)
>f..t...... VM_1-flat.vmdk
 42949672960 100%   12.19MB/s    0:56:01 (xfer#4, to-check=2/6)
>f..t...... VM_1.vmdk
         558 100%    2.51kB/s    0:00:00 (xfer#5, to-check=1/6)
>f..t...... STATUS.ok
          30 100%    0.02kB/s    0:00:01 (xfer#6, to-check=0/6)

Number of files: 6
Number of files transferred: 6
Total file size: 85899350391 bytes
Total transferred file size: 85899350391 bytes
Literal data: 2429682778 bytes
Matched data: 83469667613 bytes
File list size: 129
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 2432530094
Total bytes received: 5243054

sent 2432530094 bytes  received 5243054 bytes  295648.92 bytes/sec
total size is 85899350391  speedup is 35.24

Is this because ESXi itself is making so many changes to the VMDKs that, as far as rsync is concerned, the entire file has to be retransmitted?

Has anyone actually achieved a true differential sync with ESXi?


Solution 1:

It looks like you've only transferred about 2GB of incremental changes (see "Literal data" in the stats). Remember that rsync still has to read each file in full and checksum it, so it has to read 80GB of data. Check your server stats during the rsync. Are you CPU-bound or I/O-bound during the operation? How fast can you read the 80GB off disk? That will be close to your absolute minimum transfer time.
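
As a rough cross-check, the figures in the question's --stats output can be plugged into a few lines of Python. This is only a sketch: the 100 MB/s read rate is a made-up placeholder, not a measurement from the poster's hardware, and the "floor" ignores checksumming and writes entirely.

```python
# Figures copied from the --stats output in the question.
total_size = 85_899_350_391   # "Total file size"
literal = 2_429_682_778       # "Literal data": bytes sent as new data
matched = 83_469_667_613      # "Matched data": bytes reused from the old copy
sent = 2_432_530_094          # "Total bytes sent"
received = 5_243_054          # "Total bytes received"

literal_fraction = literal / total_size
speedup = total_size / (sent + received)


def min_read_seconds(total_bytes, read_bytes_per_sec):
    """Lower bound on run time if reading the source were the only cost."""
    return total_bytes / read_bytes_per_sec


# Hypothetical sustained read rate; replace with a measured value.
ASSUMED_READ_RATE = 100 * 10**6

print(f"literal share: {literal_fraction:.1%}")
print(f"speedup: {speedup:.2f}")
floor_min = min_read_seconds(total_size, ASSUMED_READ_RATE) / 60
print(f"read-only floor: {floor_min:.1f} min")
```

The computed speedup matches the 35.24 that rsync reported, and only about 2.8% of the bytes were literal data. So the delta algorithm did its job; the elapsed time is going into reading and rewriting the files rather than into transferring changes.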

Also of note is that rsync builds a new copy of the file while transferring, then moves the finished file into place in an atomic operation. You can see this during the transfer: a file with a similar name and a random suffix appears in the destination directory. This means that you have to read 160GB of data (80GB each from source and destination) and write out 80GB on the destination side. Have you looked at the --inplace option? It may be beneficial here.
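
That I/O accounting can be made concrete with a small model. It is a simplification under stated assumptions: the source and the old destination are each read once in full, and with --inplace only the changed data is rewritten. Real numbers depend on the rsync version, how the changes line up with rsync's blocks, and the filesystem.

```python
def io_volume(file_bytes, changed_bytes, inplace):
    """Return (bytes_read, bytes_written) for one local delta sync.

    Assumes both copies are read in full, and that --inplace rewrites
    only the changed data. Without --inplace the destination is rebuilt
    as a temporary file, so the full size is written again.
    """
    bytes_read = 2 * file_bytes
    bytes_written = changed_bytes if inplace else file_bytes
    return bytes_read, bytes_written


GB = 10**9
# Round figures from the thread: 80GB of VMDKs, about 2GB changed.
for inplace in (False, True):
    read, written = io_volume(80 * GB, 2 * GB, inplace)
    label = "--inplace" if inplace else "default"
    print(f"{label:>9}: read {read // GB} GB, write {written // GB} GB")
```

Under these assumptions the reads stay the same either way, so --inplace mainly saves on writes and on the temporary 80GB of extra space at the destination.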

In short, you may only have 2GB of changes, but rsync is doing LOTS of work. You're probably I/O-bound, as all that reading and writing on the same disk could cause a lot of contention and slowdown.

Solution 2:

This thread is very old, but maybe this helps someone.

Because ESX locks the filesystem on every write of new blocks, the performance isn't that great. With the --inplace option you may get better results, but be aware that if you cancel the sync, the destination file will no longer be consistent. As for consistency, an rsync of an open file can be inconsistent anyway, so it is better to take a snapshot before running rsync.
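
For anyone who wants to try --inplace, here is a minimal sketch that assembles a trimmed variant of the question's command and prints it. The helper name build_rsync_args is made up for illustration, and the script only prints the command rather than running it. It drops -S when --inplace is used on the assumption that rsync 3.0.x refuses to combine --sparse with --inplace; check the man page for your build. It needs Python 3.8+ for shlex.join.

```python
import shlex


def build_rsync_args(src, dst, inplace=True):
    """Assemble an rsync argument list (hypothetical helper, not part of rsync)."""
    # -a archive mode, -v verbose, -P progress plus keeping partial files
    args = ["rsync", "-avP", "--stats", "--itemize-changes", "--no-whole-file"]
    if inplace:
        # Update the destination file directly instead of via a temp copy.
        args.append("--inplace")
    else:
        # Sparse handling, as in the original command; assumed to be
        # incompatible with --inplace on older rsync releases.
        args.append("-S")
    args += [src, dst]
    return args


# Directory names taken from the question.
cmd = build_rsync_args(
    "VMBACKUP_2011-06-10_02-27-56/",
    "VMBACKUP_2011-06-01_06-37-11/",
)
print(shlex.join(cmd))
```

Since an interrupted --inplace run leaves the destination half-updated, it is worth keeping at least one older backup directory untouched until the sync has finished cleanly.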

Best regards, Marc