Why does gzip on tar output always produce different results?

Solution 1:

The header for the resulting gzip file is different depending on how it is called.

Gzip stores some information about its input in the header of the file it produces. When it compresses a regular file, that is by default the original file name and a timestamp, which it takes from that file's modification time.

When it compresses data piped to it, there is no file to take these from, so it stores no name and, as far as I can tell, uses the current time as the timestamp. That is why two runs made at different times produce different output.

To prove this, try adding the -n option to the offending lines in your example:

~/temp$ tar c file | gzip -n > file1.tar.gz
~/temp$ tar c file | gzip -n > file.tar.gz
~/temp$ cmp file.tar.gz file1.tar.gz

Now the files are identical again...

From man gzip ...

   -n --no-name
          When  compressing,  do  not save the original file name and time
          stamp by default. (The original name is always saved if the name
          had  to  be  truncated.)  When decompressing, do not restore the
          original file name if present (remove only the gzip suffix  from
          the  compressed  file name) and do not restore the original time
          stamp if present (copy it from the compressed file). This option
          is the default when decompressing.

So the difference is indeed the original file name and time stamp information that is turned off by the -n param.
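
If you want to look at those header fields yourself, here is a minimal sketch in Python that reads them back. It assumes the layout from the gzip format spec (RFC 1952): a 10-byte fixed header with a 4-byte little-endian MTIME at offset 4, followed by an optional NUL-terminated name when the FNAME flag is set. The helper names read_gzip_header and make_stream are made up for this illustration, and the streams are built with Python's gzip module rather than the gzip command, so treat it as an approximation of what gzip and gzip -n write, not a dump of their actual output.

```python
import gzip
import io
import struct


def read_gzip_header(blob: bytes) -> dict:
    """Return the MTIME field and the stored name (if any) of a gzip stream."""
    # Fixed header: magic (2), method (1), flags (1), mtime (4), xfl (1), os (1)
    magic, _method, flags, mtime, _xfl, _os = struct.unpack("<HBBIBB", blob[:10])
    if magic != 0x8B1F:  # bytes 0x1f 0x8b read as a little-endian 16-bit value
        raise ValueError("not a gzip stream")
    pos = 10
    if flags & 0x04:  # FEXTRA: skip the optional extra field
        (xlen,) = struct.unpack("<H", blob[pos:pos + 2])
        pos += 2 + xlen
    name = None
    if flags & 0x08:  # FNAME: NUL-terminated original file name
        end = blob.index(b"\x00", pos)
        name = blob[pos:end].decode("latin-1")
    return {"mtime": mtime, "name": name}


def make_stream(name: str, mtime: int) -> bytes:
    """Hypothetical helper: compress a fixed payload with the given metadata."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=name, mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(b"example payload")
    return buf.getvalue()


with_meta = make_stream("file.tar", 1_700_000_000)  # name and timestamp stored
stripped = make_stream("", 0)                       # comparable to gzip -n

print(read_gzip_header(with_meta))
print(read_gzip_header(stripped))
```

The first stream reports the name and timestamp it was given; the second reports a zero timestamp and no name, which is the state that makes repeated runs byte-identical.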

Solution 2:

Gzip files include a timestamp. If you create two gzip files at different times, they will differ in that timestamp, not in the compressed content.
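
A quick way to convince yourself of this, sketched in Python under the assumption that gzip.compress accepts an mtime argument (Python 3.8 or later). The payload and the two timestamps are arbitrary values chosen for the illustration; the point is only to see where the outputs diverge.

```python
import gzip

data = b"same tar stream " * 100

# Same input, "created" one minute apart.
early = gzip.compress(data, mtime=1_000_000_000)
late = gzip.compress(data, mtime=1_000_000_060)

# Same input with the timestamp pinned, which is what gzip -n effectively does.
pinned_1 = gzip.compress(data, mtime=0)
pinned_2 = gzip.compress(data, mtime=0)

# Collect the byte offsets at which the two timestamped streams differ.
differing = [i for i, (x, y) in enumerate(zip(early, late)) if x != y]

print("same length:", len(early) == len(late))
print("differing offsets:", differing)  # all fall inside the MTIME field (offsets 4-7)
print("pinned streams identical:", pinned_1 == pinned_2)
```

The only offsets that differ sit in the 4-byte timestamp field of the header; the compressed data after it is the same.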