Fastest GZIP utility
I'm looking for the fastest gzip (or zip) utility. I have an LVM volume that is 95% blank zeros, so compressing it is very easy. I'm looking for the fastest solution and don't really care much about compression, except for the zeros.
I'm aware of gzip -1 (the same as gzip --fast), but was wondering whether there's any faster method.
Thanks.
Edit:
after some tests, I compared gzip -1, lzop -1 and pigz -1 with each other and came to the following results:
PIGZ:
time dd if=/dev/VPS/snap | pigz -1 | ssh backup-server "dd of=/home/backupvps/snap.pigz"
104857600+0 records in
104857600+0 records out
53687091200 bytes (54 GB) copied, 2086.87 seconds, 25.7 MB/s
7093985+266013 records in
7163950+1 records out
3667942715 bytes (3.7 GB) copied, 2085.75 seconds, 1.8 MB/s
real 34m47.147s
LZOP:
time dd if=/dev/VPS/snap | lzop -1 | ssh backup-server "dd of=/home/backupvps/snap.lzop"
104857600+0 records in
104857600+0 records out
53687091200 bytes (54 GB) copied, 1829.31 seconds, 29.3 MB/s
7914243+311979 records in
7937728+1 records out
4064117245 bytes (4.1 GB) copied, 1828.08 seconds, 2.2 MB/s
real 30m29.430s
GZIP:
time dd if=/dev/VPS/snap | gzip -1 | ssh backup-server "dd of=/home/backupvps/snap_gzip.img.gz"
104857600+0 records in
104857600+0 records out
53687091200 bytes (54 GB) copied, 1843.61 seconds, 29.1 MB/s
7176193+42 records in
7176214+1 records out
3674221747 bytes (3.7 GB) copied, 1842.09 seconds, 2.0 MB/s
real 30m43.846s
Edit 2:
This is somewhat unrelated to my initial question; however, using time dd if=/dev/VPS/snap | lzop -1 | ssh backup-server "dd of=/home/backupvps/snap.lzop" with the block size changed to 16M, the time is reduced to real 18m22.442s!
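Presumably the faster pipeline looks something like the sketch below; exactly where bs=16M goes (the reading dd, the writing dd, or both) is an assumption, since only the size is mentioned above:

# assumed variant of the pipeline above; the bs=16M placement is a guess
time dd if=/dev/VPS/snap bs=16M | lzop -1 | ssh backup-server "dd of=/home/backupvps/snap.lzop bs=16M"

The larger block size means far fewer read/write calls for the same 50 GB, which is the likely source of the time saving.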
If you don't mind stepping away from DEFLATE, lzop is an implementation of LZO that favors speed over compression ratio.
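For the streaming setup in the question, lzop can drop into the same pipe in place of gzip; a minimal sketch reusing the paths from the question (the decompression line is just an illustration of getting the raw image back):

# fastest lzop level, streamed straight to the backup server
dd if=/dev/VPS/snap | lzop -1 | ssh backup-server "dd of=/home/backupvps/snap.lzop"
# on the backup server, lzop works as a filter to restore the raw image
lzop -dc < /home/backupvps/snap.lzop > /home/backupvps/snap.img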
Although I personally have not yet used it, I think using parallel gzip could speed things up a bit:
pigz, which stands for parallel implementation of gzip, is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.
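A minimal sketch of how pigz might slot into the pipeline from the question; the -p 4 thread count is an illustrative value, not a recommendation:

# -1 = fastest level, -p 4 = use four compression threads (adjust to your CPU count)
dd if=/dev/VPS/snap | pigz -1 -p 4 | ssh backup-server "dd of=/home/backupvps/snap.gz"

Since pigz writes a standard gzip stream, the result can still be decompressed with plain gunzip on the other side.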
You can try Parallel Gzip (Pascal linked it in), or Parallel BZIP.
In theory, BZIP is much better for text, so you may want to try pbzip.
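If that refers to pbzip2, a hedged sketch of how it could be wired into the question's pipeline (-1 is the fastest level and -p4 the number of processors; both values are illustrative):

# pbzip2 reading from stdin and writing to stdout inside the pipe
dd if=/dev/VPS/snap | pbzip2 -1 -p4 -c | ssh backup-server "dd of=/home/backupvps/snap.bz2"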
Your disk is limited to about 30 MB/s, so all of these compressors keep up well enough. You can even reduce the network transfer further by using the slightly slower but omnipresent bzip2.
$dd if=/dev/zero bs=2M count=512 | pigz -1 | dd > /dev/null
512+0 records in
512+0 records out
1073741824 bytes (1.1 GB) copied, 9.12679 s, 118 MB/s
8192+7909 records in
9488+1 records out
4857870 bytes (4.9 MB) copied, 9.13024 s, 532 kB/s
$dd if=/dev/zero bs=2M count=512 | bzip2 -1 | dd > /dev/null
512+0 records in
512+0 records out
1073741824 bytes (1.1 GB) copied, 37.4471 s, 28.7 MB/s
12+1 records in
12+1 records out
6533 bytes (6.5 kB) copied, 37.4981 s, 0.2 kB/s
$dd if=/dev/zero bs=2M count=512 | gzip -1 | dd > /dev/null
512+0 records in
512+0 records out
1073741824 bytes (1.1 GB) copied, 14.305 s, 75.1 MB/s
9147+1 records in
9147+1 records out
4683762 bytes (4.7 MB) copied, 14.3048 s, 327 kB/s
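Wired into the pipeline from the question, the bzip2 variant would look roughly like this (paths taken from the question):

# bzip2 acts as a filter here, reading stdin and writing stdout
dd if=/dev/VPS/snap | bzip2 -1 | ssh backup-server "dd of=/home/backupvps/snap.bz2"

For data that is almost all zeros, the benchmark above suggests bzip2 -1 keeps up with the ~30 MB/s disk while shrinking the stream far more than gzip or pigz do.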
Have you considered rsync? It checksums the data and then gzips only the differences.
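A minimal sketch of that approach, assuming the snapshot has first been dumped to a regular file (stock rsync does not copy the contents of a block device directly, and the file names here are illustrative):

# -a preserve attributes, -z compress data in transit, --partial keep partially transferred files
rsync -az --partial /tmp/snap.img backup-server:/home/backupvps/snap.img

On repeated runs, rsync's delta algorithm only transfers the parts of the image that changed since the previous copy.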