Best compression for ZFS send/recv

I'm sending incremental ZFS snapshots over a point-to-point T1 line, and we've reached the point where a day's worth of snapshots can barely make it over the wire before the next backup starts. Our send/recv command is:

zfs send -i tank/vm@2009-10-10 tank/vm@2009-10-12 | bzip2 -c | \
ssh offsite-backup "bzcat | zfs recv -F tank/vm"

I have plenty of CPU cycles to spare. Is there a better compression algorithm or alternative method I can use to push less data over the line?


Solution 1:

It sounds like you've tried all of the best compression mechanisms and are still being limited by the line speed. Assuming running a faster line is out of the question, have you considered just running the backups less frequently so that they have more time to run?

Short of that, is there some way to lower the amount of data being written? Without knowing your application stack it's hard to say how, but things like making sure apps overwrite existing files instead of creating new ones might help, as would making sure you aren't backing up temp/cache files that you won't need (see the sketch below).
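As a rough sketch of the second idea (the dataset names are hypothetical, and it assumes the scratch data can be relocated): since ZFS snapshots are per-dataset, keeping temp/cache data on its own dataset keeps it out of the snapshots you replicate.

# Hypothetical layout: scratch/cache data lives in its own dataset
zfs create tank/vm-cache
# Snapshots of tank/vm no longer include the cache data,
# so the incremental stream you send stays smaller
zfs snapshot tank/vm@2009-10-13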

Solution 2:

Things have changed in the years since this question was posted:

1: ZFS now supports compressed replication: just add the -c flag to the zfs send command, and blocks that were compressed on disk will remain compressed as they pass through the pipe to the other end. There may still be more compression to be gained, because the default compression in ZFS is lz4 (see the combined example after this list).

2: The best compressor to use in this case is zstd (Zstandard). It now has an 'adaptive' mode that changes the compression level (across the 19+ levels supported, plus the newer high-speed zstd-fast levels) based on the speed of the link between zfs send and zfs recv: it compresses as much as it can while keeping the queue of data waiting to go out the pipe to a minimum. If your link is fast it won't waste time compressing the data more, and if your link is slow it will keep working to compress the data more and save you time in the end. It also supports threaded compression, so it can take advantage of multiple cores, which gzip and bzip2 do not outside of special versions like pigz. A combined example is shown below.
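Putting both points together, a sketch of the asker's pipeline might look like this (dataset names and the offsite-backup host are carried over from the question; zstd's --adapt and -T0 flags require a reasonably recent zstd build):

zfs send -c -i tank/vm@2009-10-10 tank/vm@2009-10-12 | \
zstd -T0 --adapt -c | \
ssh offsite-backup "zstd -dc | zfs recv -F tank/vm"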

Solution 3:

Here is what I've learned doing the exact same thing you are doing. I suggest using mbuffer. When testing in my environment it only helped on the receiving end; without it, the send would stall while the receive caught up.

Some examples: http://everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/

Homepage with options and syntax: http://www.maier-komor.de/mbuffer.html

The send command from my replication script:

zfs send -i tank/pool@oldsnap tank/pool@newsnap | ssh -c arcfour remotehostip "mbuffer -s 128k -m 1G | zfs receive -F tank/pool"

This runs mbuffer on the remote host as a receive buffer so the send can run as fast as possible. I run a 20 Mbit line and found that having mbuffer on the sending side as well didn't help. Also, my main ZFS box uses all of its RAM as cache, so giving even 1 GB to mbuffer would require me to reduce some cache sizes.
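If you do want to carve out memory for mbuffer, the ARC can be capped; on Linux with OpenZFS that looks roughly like the following (the 8 GiB figure is purely illustrative, and on Solaris-era systems the equivalent would be a zfs:zfs_arc_max setting in /etc/system):

# Illustrative only: cap the ARC at 8 GiB so ~1 GiB is left free for mbuffer
echo $((8 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_arc_max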

Also, and this isn't really my area of expertise, I think it's best to just let ssh do the compression. In your example you are compressing with bzip2 and then piping through ssh; if ssh compression is enabled on top of that, ssh ends up trying to compress an already-compressed stream. I ended up using arcfour as the cipher, as it's the least CPU-intensive and that was important for me. You may have better results with another cipher, but I'd definitely suggest letting ssh do the compression (or turning off ssh compression if you really want to use something it doesn't support).
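For example, the send command above with ssh handling the compression (the -C flag) would look like this; note that arcfour has since been removed from stock OpenSSH, so treat the cipher choice as illustrative:

zfs send -i tank/pool@oldsnap tank/pool@newsnap | \
ssh -C -c arcfour remotehostip "mbuffer -s 128k -m 1G | zfs receive -F tank/pool"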

What's really interesting is that using mbuffer when sending and receiving on localhost speeds things up as well:

zfs send tank/pool@snapshot | mbuffer -s 128k -m 4G -o - | zfs receive -F tank2/pool

I found that 4 GB seems to be the sweet spot for localhost transfers for me. It just goes to show that zfs send/receive works best when there is no latency or other pauses in the stream.

Just my experience, hope this helps. It took me a while to figure all this out.