What's the difference between .tar.gz and .gz, or .tar.7z and .7z?
If you come from a Windows background, you may be familiar with the zip and rar formats. These are archives of multiple files compressed together.
In Unix and Unix-like systems (like Ubuntu), archiving and compression are separate.
- tar puts multiple files into a single (tar) file.
- gzip compresses one file (only).
So, to get a compressed archive, you combine the two: first use tar or pax to get all files into a single file (archive.tar), then gzip it (archive.tar.gz).
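The two steps above can be sketched as follows. This is only an illustration: the project directory and its files are made-up names, and everything happens in a temporary directory.

```shell
# Work in a throwaway directory so nothing on disk is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample files.
mkdir project
printf 'hello\n' > project/a.txt
printf 'world\n' > project/b.txt

# Step 1: archive only (no compression yet).
tar cf archive.tar project

# Step 2: compress the single archive file; gzip replaces it with archive.tar.gz.
gzip archive.tar

# The z flag asks tar to do both steps in one go.
tar czf oneshot.tar.gz project

# Both archives list the same members.
tar tzf archive.tar.gz
tar tzf oneshot.tar.gz
```

Either route ends with a gzip-compressed tar archive; the z flag is just a shortcut for the two-step version.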
If you have only one file you need to compress (notes.txt), there's no need for tar, so you just do gzip notes.txt, which will result in notes.txt.gz. There are other compression tools, such as compress, bzip2 and xz, which work in the same manner as gzip (apart from using different compression algorithms, of course).
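A quick round trip for the single-file case might look like this (notes.txt and its contents are placeholders, created in a temporary directory):

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical single file to compress.
printf 'remember the milk\n' > notes.txt

# gzip replaces notes.txt with notes.txt.gz.
gzip notes.txt
ls

# gunzip reverses it and restores notes.txt.
gunzip notes.txt.gz
cat notes.txt
```

If bzip2 or xz are installed, they follow the same pattern: bzip2 notes.txt gives notes.txt.bz2 and xz notes.txt gives notes.txt.xz.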
It depends on what you are looking for... Compression or archiving?
When I talk about archiving, I mean preserving permissions, directory structure, etc...
Compression may ignore most of that and just get your files into a smaller package.
To preserve file permissions, use tar:
tar cpvf backup.tar folder
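tar records permissions in the archive when you create it; the p flag matters mostly on extraction, where it makes tar restore them exactly instead of applying your umask. Use the z flag for gzip compression or the j flag for bzip2 compression.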
tar czpvf backup.tar.gz folder #backup.tgz is acceptable as well
tar cjpvf backup.tar.bz2 folder #backup.tbz2 works too
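To check that permissions really survive the round trip, something like the following should do. It is a sketch: run.sh and the restore directory are made-up names, and mode 750 is an arbitrary choice.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical script with a non-default mode.
mkdir folder
printf '#!/bin/sh\necho hi\n' > folder/run.sh
chmod 750 folder/run.sh

# Create a gzip-compressed archive.
tar czpf backup.tar.gz folder

# Extract into a separate directory, restoring permissions (p).
mkdir restore
tar xzpf backup.tar.gz -C restore

# The mode column should be identical for both copies.
ls -l folder/run.sh restore/folder/run.sh
```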
If you want a tar file that you can "update" later, package it using the P flag:
tar cpPvf backup.tar folder
Then to update, replace 'c' with 'u' and when unpacking, you can use 'k' to preserve files that already exist.
tar upPvf backup.tar folder #updating a tar file
tar xpPkvf backup.tar #extracting a tar with permissions(p) and not extracting(k) files that exist on disk already
The P flag stores files with their full (absolute) paths, i.e. /home/username rather than home/username (notice the leading forward slash).
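Here is a small sketch of the update step on its own, using relative paths and made-up file names. The old timestamp is set explicitly so that the edited file is unambiguously newer than the archived copy.

```shell
workdir=$(mktemp -d)
cd "$workdir"

mkdir folder
printf 'v1\n' > folder/a.txt
# Give the file an old timestamp so the later edit is clearly newer.
touch -t 202001010000 folder/a.txt

tar cpf backup.tar folder

# Change the file, then append the newer copy with 'u'.
printf 'v2\n' > folder/a.txt
tar upf backup.tar folder

# The listing now shows folder/a.txt more than once.
tar tf backup.tar

# On extraction the later copy overwrites the earlier one.
mkdir restore
tar xpf backup.tar -C restore
cat restore/folder/a.txt
```

Updating appends to the archive rather than rewriting it, so the old copy stays inside the tar file. It also only works on an uncompressed .tar, not on a .tar.gz.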
7z usually offers better compression than gzip, but the format does not preserve file ownership, permissions, etc. Rzip is another compression utility that offers compression comparable to 7z.
I guess a backup.tar.7z file is just a tar file (with permissions) compressed with 7z, though I wouldn't be surprised if little compression occurred, because 7z may not be able to drop the file metadata. It's partly 7z's ability to leave out the file metadata that lets it offer great compression (amongst other things, of course).
Compression depends heavily on the data type as well. Some files don't compress well because they are already compressed by some other means (e.g. .mp3, .jpg, .tiff with LZMA, .rpm, etc.).
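One way to see this for yourself is to compress very repetitive data and random data of the same size. In this sketch, random bytes stand in for already-compressed content, and the file names and the 100 KB size are arbitrary.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# 100 KB of highly repetitive data (all zero bytes).
head -c 102400 /dev/zero > zeros.bin
# 100 KB of random data, a stand-in for already-compressed content.
head -c 102400 /dev/urandom > random.bin

# -c writes to stdout, so the originals are kept.
gzip -c zeros.bin > zeros.bin.gz
gzip -c random.bin > random.bin.gz

# Compare sizes in bytes.
wc -c zeros.bin zeros.bin.gz random.bin random.bin.gz
```

The zero-filled file should shrink to a tiny fraction of its size, while the random one should stay at roughly its original size.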
gzip and bzip2 don't know about the file system: file names, directories, or tree structure. They just compress an input stream and output the result. gzip and bzip2 can't archive directories on their own, which is why they are usually combined with tar.
tar (archiver): just archives the file structure. gzip, bzip2 (compressors): just compress their input.
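The stream view can be illustrated with plain pipes. This is a sketch with made-up names (tree, leaf.txt), run in a temporary directory.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# gzip as a pure stream filter: no file names involved at all.
roundtrip=$(printf 'just a stream\n' | gzip -c | gunzip -c)
echo "$roundtrip"

# tar turns a directory tree into one stream, which gzip then compresses.
mkdir -p tree/sub
printf 'leaf\n' > tree/sub/leaf.txt
tar cf - tree | gzip -c > tree.tar.gz

# Reverse direction: gunzip feeds the stream back to tar for listing.
gunzip -c tree.tar.gz | tar tf -
```

Here tar is the only tool that knows about directories and file names; gzip only ever sees bytes going in and bytes coming out.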
I think this strategy came from the 'do one thing well' Unix philosophy. Tar works well? Leave it as is. Need a better compression ratio than gzip? Here is bzip2 or 7zip.