How do you set bzip2 block size when using tar?

I am using tar to back up a Linux server to tape. I am using the -j option to compress the archive with bzip2, but I can't see a way to adjust bzip2's block size from tar. The default block size is 900,000 bytes, which gives the best compression but is the slowest. I am not that bothered about the compression ratio, so I am looking to make bzip2 run faster with a smaller block size.


Solution 1:

export BZIP=--fast
tar cjf foo.tar.bz2 foo

Or pipe the output of tar to bzip2.

Though you should note from the bzip2 man page:

    -1 (or --fast) to -9 (or --best)
              Set  the  block size to 100 k, 200 k ..  900 k when compressing.
              Has no effect when decompressing.  See MEMORY MANAGEMENT  below.
              The --fast and --best aliases are primarily for GNU gzip compat-
              ibility.  In particular, --fast  doesn't  make  things  signifi-
              cantly faster.  And --best merely selects the default behaviour.
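
To check which block size an archive was actually written with, you can look at its first four bytes: a bzip2 stream starts with BZh followed by a digit from 1 to 9 giving the block size in units of 100 k. The following is a minimal sketch rather than anything from the man page; the directory and file names (demo, demo.tar.bz2) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch area; these names are for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/demo"
echo "some sample data" > "$work/demo/file.txt"

# Compress via a pipe so the level is passed to bzip2 directly.
tar -cf - -C "$work" demo | bzip2 --fast > "$work/demo.tar.bz2"

# The first four bytes are "BZh" plus the block-size digit (1 = 100 k).
header=$(head -c 4 "$work/demo.tar.bz2")
echo "$header"
```

With --fast (that is, -1) the header should read BZh1; an archive written with the default settings reads BZh9.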

Solution 2:

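With bsdtar (the libarchive tar, not GNU tar), the level can be passed with --options. Use 1 for the smallest block size; 9 is bzip2's default:

tar -cjf dir.tar.bz2 --options bzip2:compression-level=1 path/to/dir/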

Solution 3:

bzip2 block sizes

bzip2 has some block size options. From the manual page bzip2(1):

-1 (or --fast) to -9 (or --best)
       Set the block size to 100 k, 200 k ..  900 k when compressing.
       Has no effect when decompressing. See MEMORY MANAGEMENT below.
       The --fast and --best aliases are primarily for GNU gzip
       compatibility. In particular, --fast doesn't make things
       significantly faster. And --best merely selects the default
       behaviour.

Since you want faster compression and care less about the compression ratio, the option you want with bzip2 is -1 (or --fast).

Setting bzip2 block size when using tar

You can set bzip2 block size when using tar in a couple of ways.

The UNIX way

My favorite way, the UNIX way, is one where you use every tool independently, and combine them through pipes.

$ tar --create --file - [FILE...] | bzip2 -1 > [ARCHIVE].tar.bz2

You can read that as "create .tar with tar -> bzip it with bzip2 -> write it to [ARCHIVE].tar.bz2".
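
Reading the archive back works the same way in reverse: bzip2 decompresses to stdout and tar reads from stdin. No level flag is needed on the way back, since the block-size options have no effect when decompressing. A small round-trip sketch, with illustrative paths that you would replace with your own:

```shell
#!/bin/sh
# Hypothetical scratch area; these names are for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/src" "$work/restore"
printf 'hello\n' > "$work/src/a.txt"

# Create: tar writes the archive to stdout, bzip2 -1 compresses it.
tar -cf - -C "$work" src | bzip2 -1 > "$work/src.tar.bz2"

# Restore: bzip2 -dc decompresses to stdout, tar -x reads it from stdin.
bzip2 -dc "$work/src.tar.bz2" | tar -xf - -C "$work/restore"

# Compare the original tree with the restored copy.
if diff -r "$work/src" "$work/restore/src" > /dev/null; then
    result="identical"
else
    result="different"
fi
echo "$result"
```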

Environment variable

It is also possible to set bzip2 options through the environment variable BZIP2. From the manual page bzip2(1):

bzip2 will read arguments from the environment variables BZIP2 and BZIP,
in that order, and will process them before any arguments read from the
command line. This gives a convenient way to supply default arguments.

So to use that with tar, you could for example do:

$ BZIP2=-1 tar --create --bzip2 --file [ARCHIVE].tar.bz2 [FILE...]
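
Whether the variable takes effect depends on how tar does the compression. GNU tar starts bzip2 as a separate program, which inherits the environment; a tar that compresses through a built-in library (as libarchive's bsdtar can) may never run the bzip2 binary and so would ignore the variable. That is my understanding rather than something the man page states, so it is worth checking on your system. A sketch of such a check, using made-up file names and the fact that a .bz2 file begins with BZh followed by the block-size digit:

```shell
#!/bin/sh
# Hypothetical scratch area; these names are for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/demo"
echo "some sample data" > "$work/demo/file.txt"

# Set BZIP2 only for this one tar invocation.
BZIP2=-1 tar --create --bzip2 --file "$work/demo.tar.bz2" -C "$work" demo

# The fourth byte is the block size in units of 100 k.
# A 1 means bzip2 picked up the variable; a 9 means the default was used.
level=$(head -c 4 "$work/demo.tar.bz2" | cut -c 4)
echo "block size digit: $level"
```

If the digit comes back as 9, fall back to the pipe form, which does not depend on the tar implementation.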

Faster alternatives

bzip2 uses a slow compression algorithm. If you are concerned about speed, you could investigate alternative algorithms, such as those used by gzip or lzop. Here is a nice article comparing compression tools: https://aliver.wordpress.com/2010/06/22/huge-unix-file-compresser-shootout-with-tons-of-datagraphs/
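
Because tar only produces the stream, swapping the compressor is a one-word change in the pipeline. The sketch below is a rough way to compare candidates on your own data before committing to one. The sample data is generated on the fly and is not representative of a real server, and lzop is left out because it is often not installed by default.

```shell
#!/bin/sh
# Hypothetical scratch area; these names are for illustration only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/data"
# Generate some compressible sample data (replace with a real directory).
seq 1 200000 > "$work/data/numbers.txt"

# Run the same tar stream through each compressor and report the size.
for cmd in "bzip2 -9" "bzip2 -1" "gzip -1"; do
    out="$work/$(echo "$cmd" | tr -d ' ').out"
    # $cmd is intentionally unquoted so it splits into program and flag.
    tar -cf - -C "$work" data | $cmd > "$out"
    printf '%-10s %s bytes\n' "$cmd" "$(wc -c < "$out")"
done
```

This only reports sizes. To compare speed, run each pipeline under time in a shell that provides it, ideally on a sample of the data you actually back up.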

Solution 4:

Send the tar output to stdout and then pipe it through bzip2 separately:

% tar cvf - _file_ | bzip2 _opts_ > output.tar.bz2