How do you set bzip2 block size when using tar?
I am using tar to back up a Linux server to tape. I am using the -j option to compress the archive with bzip2, but I can't see a way to adjust the bzip2 block size from tar. The default block size is 900,000 bytes, which gives the best compression but is the slowest. I am not that bothered about the compression ratio, so I am looking to make bzip2 run faster with a smaller block size.
Solution 1:
export BZIP=--fast
tar cjf foo.tar.bz2 foo
Or pipe the output of tar to bzip2.
Though you should note from the bzip2 man page:
-1 (or --fast) to -9 (or --best) Set the block size to 100 k, 200 k .. 900 k when compressing. Has no effect when decompressing. See MEMORY MANAGEMENT below. The --fast and --best aliases are primarily for GNU gzip compatibility. In particular, --fast doesn't make things significantly faster. And --best merely selects the default behaviour.
Solution 2:
tar -cjf dir.tar.bz2 --options bzip2:compression-level=9 path/to/dir/
Solution 3:
bzip2 block sizes
bzip2 has some block size options. From the manual page bzip2(1):
-1 (or --fast) to -9 (or --best)
Set the block size to 100 k, 200 k .. 900 k when compressing.
Has no effect when decompressing. See MEMORY MANAGEMENT below.
The --fast and --best aliases are primarily for GNU gzip
compatibility. In particular, --fast doesn't make things
significantly faster. And --best merely selects the default
behaviour.
As you want faster compression and care less about the compression ratio, you seem to want the -1 (or --fast) option of bzip2.
Setting bzip2 block size when using tar
You can set the bzip2 block size when using tar in a couple of ways.
The UNIX way
My favorite way, the UNIX way, is one where you use every tool independently, and combine them through pipes.
$ tar --create --file - [FILE...] | bzip2 -1 > [ARCHIVE].tar.bz2
You can read that as "create .tar with tar -> bzip it with bzip2 -> write it to [ARCHIVE].tar.bz2".
Environment variable
It is also possible to set bzip2 options through the environment variable BZIP2. From the manual page bzip2(1):
bzip2 will read arguments from the environment variables BZIP2 and BZIP,
in that order, and will process them before any arguments read from the
command line. This gives a convenient way to supply default arguments.
So to use that with tar, you could for example do:
$ BZIP2=-1 tar --create --bzip2 --file [ARCHIVE].tar.bz2 [FILE...]
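One caveat, as far as I can tell: the variable is read by the bzip2 program itself, so it only helps when tar runs bzip2 as an external process, as GNU tar does. A tar that compresses through a built-in library may ignore it. Here is a minimal sketch for checking that the setting took effect, assuming GNU tar and a throwaway directory named foo (both are my assumptions, not from the question):

```shell
#!/bin/sh
set -eu

# Throwaway sample data; "foo" is a hypothetical directory name.
workdir=$(mktemp -d)
mkdir "$workdir/foo"
seq 1 20000 > "$workdir/foo/numbers.txt"

# The BZIP2=-1 prefix applies to this one tar invocation only.
BZIP2=-1 tar -C "$workdir" --create --bzip2 --file "$workdir/foo.tar.bz2" foo

# A bzip2 stream starts with "BZh" followed by the block-size digit (1-9).
header=$(head -c 4 "$workdir/foo.tar.bz2")
echo "header: $header"
```

The fourth byte is the block-size digit, so with GNU tar I would expect BZh1 here (100k blocks). BZh9 would mean the default 900k blocks were used and the variable was ignored.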
Faster alternatives
bzip2 uses a slow compression algorithm. If you are concerned about speed, you could investigate alternative algorithms, such as those used by gzip or lzop. Here is a nice article comparing compression tools: https://aliver.wordpress.com/2010/06/22/huge-unix-file-compresser-shootout-with-tons-of-datagraphs/
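Since the man page warns that --fast is not significantly faster, it is probably worth measuring on your own data before switching tools. Here is a rough sketch, assuming bzip2 and gzip are installed and using a generated directory named sample (hypothetical) as a stand-in for real backup data. Timings are whole seconds from date +%s, so a realistically large sample is needed to see a difference:

```shell
#!/bin/sh
set -eu

# Stand-in data: about 2 MB of compressible text. Replace with real files.
workdir=$(mktemp -d)
mkdir "$workdir/sample"
seq 1 300000 > "$workdir/sample/numbers.txt"

for cmd in "bzip2 -1" "bzip2 -9" "gzip -1"; do
    start=$(date +%s)
    # $cmd is left unquoted on purpose so the flag is passed as its own word.
    bytes=$(tar -C "$workdir" -cf - sample | $cmd | wc -c | tr -d ' ')
    end=$(date +%s)
    echo "$cmd: $bytes bytes, $((end - start)) s"
done
```

Each line of output shows the compressed size and elapsed time for one compressor setting, which should be enough to judge whether a smaller block size or a different tool is worth it for a tape backup.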
Solution 4:
Send the tar output to stdout and then pipe it through bzip2 separately:
% tar cvf - _file_ | bzip2 _opts_ > output.tar.bz2
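As a worked instance of that pipeline, here is a small sketch that creates an archive with 100k blocks and then reads it back through the reverse pipe. The directory foo and the temporary paths are placeholders I made up for illustration:

```shell
#!/bin/sh
set -eu

# Throwaway sample data; "foo" is a hypothetical directory name.
workdir=$(mktemp -d)
mkdir "$workdir/foo"
seq 1 20000 > "$workdir/foo/numbers.txt"

# Create: tar writes the archive to stdout ("-f -"), bzip2 -1 compresses it.
tar -C "$workdir" -cf - foo | bzip2 -1 > "$workdir/foo.tar.bz2"

# Verify: decompress to stdout and let tar list the archive members.
listing=$(bzip2 -dc "$workdir/foo.tar.bz2" | tar -tf -)
echo "$listing"
```

Because bzip2 runs as its own process, any of its options can go where -1 is, and tar never needs to know about them.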