What's the best way to use parallel bzip2 and gzip by default?

bzip2 and gzip use only one core, even though most computers have several. Programs such as lbzip2, pbzip2 and pigz use all available cores and promise to be compatible with bzip2 and gzip.

So what's the best way to use these programs by default, so that tar cfa file.tar.bz2 directory uses lbzip2/pbzip2 instead of bzip2? Of course I don't want to break anything.


Solution 1:

The symlink idea works well.
Another working solution is to alias tar:

alias tar='tar --use-compress-program=pbzip2'

or respectively

alias tar='tar --use-compress-program=pigz'

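This creates a different kind of default: every tar call in that shell is filtered through the parallel compressor, while direct calls to bzip2 and gzip stay untouched.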
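A less invasive variant of the alias idea, as a minimal sketch: keep plain tar untouched and define separate aliases that use a parallel compressor only when one is installed. The helper name pick_compressor and the alias names tarbz and targz are made up for this example; only --use-compress-program comes from the answer above.

```shell
# Hypothetical helper: print the first installed program among the candidates.
pick_compressor() {
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            printf '%s\n' "$prog"
            return 0
        fi
    done
    return 1
}

# Prefer a parallel bzip2 implementation, fall back to the stock one.
if bz=$(pick_compressor lbzip2 pbzip2 bzip2); then
    alias tarbz="tar --use-compress-program=$bz"
fi

# Prefer pigz, fall back to gzip.
if gz=$(pick_compressor pigz gzip); then
    alias targz="tar --use-compress-program=$gz"
fi
```

Usage would look like tarbz -cf file.tar.bz2 directory. Because the alias puts an option in front, the remaining options need the dash form (-cf rather than cf): GNU tar only accepts the old dash-less style as its first argument.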

Solution 2:

You can symlink bzip2, bunzip2 and bzcat to lbzip2, and gzip, gunzip, gzcat and zcat to pigz:

sudo apt-get install lbzip2 pigz
cd /usr/local/bin
sudo ln -s /usr/bin/lbzip2 bzip2
sudo ln -s /usr/bin/lbzip2 bunzip2
sudo ln -s /usr/bin/lbzip2 bzcat
sudo ln -s /usr/bin/pigz gzip
sudo ln -s /usr/bin/pigz gunzip
sudo ln -s /usr/bin/pigz gzcat
sudo ln -s /usr/bin/pigz zcat
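To check that the links are actually picked up, something along these lines may help. It is only a sketch: resolve_tool is a made-up helper name, and it assumes GNU readlink (for the -f option), which Ubuntu ships.

```shell
# Hypothetical helper: print where a command name resolves and the real file behind it.
resolve_tool() {
    name=$1
    path=$(command -v "$name") || { echo "$name: not found"; return 0; }
    echo "$name -> $(readlink -f "$path")"
}

# Forget cached command locations so freshly created symlinks are seen.
hash -r

for tool in bzip2 bunzip2 bzcat gzip gunzip gzcat zcat; do
    resolve_tool "$tool"
done
```

If /usr/local/bin comes before /usr/bin in PATH, the bzip2 names should resolve to lbzip2 and the gzip names to pigz. Removing the links from /usr/local/bin restores the stock tools.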

I chose lbzip2 instead of pbzip2 because the /usr/share/doc/lbzip2/README.gz looks "nicer" than /usr/share/doc/pbzip2/README.gz. Also, the tar manual talks about lbzip2.

Edit:

pigz-2.1.6, which is included in Precise Pangolin, refuses to decompress files with unknown suffixes (e.g. initramfs-*.img). This is fixed in pigz-2.2.4, which ships with Quantal. So you might want to wait until Quantal, install the Quantal package manually, or hold off on linking gunzip, gzcat and zcat for now.

Solution 3:

I would advise against the symlink answer. It replaces the default gzip (or bzip2) with pigz (or pbzip2) for the entire system. Although the parallel implementations are remarkably similar to the single-process versions, subtle differences in command-line options could break core system processes that depend on the original behavior.

The --use-compress-program option is a much better choice.

A second option (much like the alias) would be to set the TAR_OPTIONS environment variable supported by GNU tar:

export TAR_OPTIONS="--use-compress-program=pbzip2"
tar cf myfile.tar.bz2 mysubdir/
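The variable can also be set for a single command instead of being exported, which keeps the rest of the session on tar's normal behavior. The sketch below assumes GNU tar and uses the gzip family (pigz, falling back to gzip) only so that it runs on a machine without pbzip2; the directory and file names are invented for the example. Swapping in pbzip2 and a .tar.bz2 name should work the same way.

```shell
# Throwaway working directory with a small tree to archive.
work=$(mktemp -d)
mkdir "$work/mysubdir"
echo "hello" > "$work/mysubdir/a.txt"

# Prefer pigz; fall back to gzip so the sketch still runs without it.
comp=$(command -v pigz || command -v gzip)

# Setting TAR_OPTIONS on the command line limits it to this one tar call.
TAR_OPTIONS="--use-compress-program=$comp" \
    tar cf "$work/myfile.tar.gz" -C "$work" mysubdir

# The first two bytes of a gzip stream are 1f 8b.
magic=$(od -An -tx1 -N2 "$work/myfile.tar.gz" | tr -d ' \n')
echo "magic bytes: $magic"

# List the archive again through the same compressor (tar runs it with -d).
listing=$(tar --use-compress-program="$comp" -tf "$work/myfile.tar.gz")
echo "$listing"

rm -rf "$work"
```

Note that no z or j flag is given: the compressor is already chosen through TAR_OPTIONS, and a second compression option on the command line would ask tar for a different program.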

Solution 4:

Another option is to recompile tar so that it uses the multithreaded compressors by default. Copied from this Stack Overflow answer:

Recompiling with replacement

If you build tar from source, you can pass these parameters to configure:

--with-gzip=pigz
--with-bzip2=lbzip2
--with-lzip=plzip
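Put together, a build might look roughly like this. It is a sketch under assumptions: build_parallel_tar is a made-up function name, the source directory is whatever you unpacked the GNU tar sources into, and /usr/local is just a default prefix chosen here so that the distribution's own tar in /usr/bin is left alone.

```shell
# Hypothetical wrapper around the usual configure/make steps.
# $1 = unpacked GNU tar source directory, $2 = install prefix (default /usr/local).
build_parallel_tar() {
    src=$1
    prefix=${2:-/usr/local}
    (
        cd "$src" || exit 1
        # The three --with-* parameters are the ones listed above.
        ./configure --prefix="$prefix" \
            --with-gzip=pigz \
            --with-bzip2=lbzip2 \
            --with-lzip=plzip &&
        make
    )
}
```

You would call it as build_parallel_tar path/to/tar-source and then run sudo make install inside that directory. The function returns a non-zero status if the directory does not exist or the build fails.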

After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
  -j, --bzip2                filter the archive through lbzip2
      --lzip                 filter the archive through plzip
  -z, --gzip, --gunzip, --ungzip   filter the archive through pigz