Will adding a file to a zip file rewrite the whole file?

If I have a large zip file and add a file into it, will Windows create a copy of the file and then delete the original, causing very large "write amplification", or will it just add to the file?

(Perhaps it matters if it's an SSD or HDD?)


Solution 1:

Although the Zip file format was defined to allow fast appends without recopying the whole archive, I don't know of a Zip/7Zip program that does it.

The Zip archive contains a central directory that allows direct access to any included item. The directory is stored at the end of the archive, and every item, as well as the directory itself, begins with an identifying header.
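To make the role of the directory concrete, here is a minimal Python sketch that reads the directory through the standard zipfile module and prints where each item starts in the archive. The helper names list_entry_offsets and build_demo_archive are made up for this illustration, and the tiny in-memory archive is only there so the snippet runs on its own.

```python
import io
import zipfile


def build_demo_archive():
    """Build a small two-item archive in memory (illustrative data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "first item")
        zf.writestr("b.txt", "second item")
    return buf.getvalue()


def list_entry_offsets(data):
    """Return (name, offset of the item's header) pairs read from the directory."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [(info.filename, info.header_offset) for info in zf.infolist()]


if __name__ == "__main__":
    data = build_demo_archive()
    for name, offset in list_entry_offsets(data):
        # Each item begins with the local file header signature b"PK\x03\x04".
        print(name, offset, data[offset:offset + 4])
```

Because the directory records an offset for every item, a reader can seek straight to one item without scanning the ones before it.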

In principle, a new item can be added by appending it to the archive file and then appending a new directory. However, a computer failure before the new directory has been completely written and flushed to disk may leave the archive with rubbish at its end.
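As an illustration of an in-place append, the sketch below uses the append mode ("a") of Python's zipfile module. As far as I understand its implementation, it seeks to the start of the old directory, writes the new item there, and then writes a fresh directory after it, which is a slight variant of the scheme described above. The names append_in_place and demo, and the file names and sizes, are assumptions made for this example. It is a library call, not a test of any desktop Zip program.

```python
import os
import tempfile
import zipfile


def append_in_place(archive_path, name, payload):
    """Append one item with zipfile's append mode; return (old_size, new_size)."""
    old_size = os.path.getsize(archive_path)
    with zipfile.ZipFile(archive_path, "a") as zf:
        zf.writestr(name, payload)
    return old_size, os.path.getsize(archive_path)


def demo():
    """Create an archive, append to it, and report what changed."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.zip")
        with zipfile.ZipFile(path, "w") as zf:
            zf.writestr("big.bin", b"x" * 100_000)  # stored uncompressed by default
        with open(path, "rb") as f:
            before = f.read(100_000)
        old_size, new_size = append_in_place(path, "small.txt", b"hello")
        with open(path, "rb") as f:
            after = f.read(100_000)
        with zipfile.ZipFile(path) as zf:
            names = zf.namelist()
        # Identical leading bytes and small growth are consistent with an
        # in-place append, but they do not prove which sectors were written.
        return before == after, new_size - old_size, names


if __name__ == "__main__":
    print(demo())
```

Note that this approach has exactly the weakness described above: if the process dies between overwriting the old directory and finishing the new one, the archive is left without a valid directory at its end.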

While in theory the archive can still be recovered by scanning backward from its end for the header of the last completely written directory, a program that simply expects to find the directory at the end of the file will fail and report a corrupt archive.
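The backward scan can be sketched as follows. This is a simplified illustration, not a full recovery tool: it looks for the end-of-central-directory record (signature "PK\x05\x06", 22 bytes without a comment), ignores ZIP64 archives and archives with data prepended to them, and accepts a candidate only if the directory it describes ends exactly where the record begins. The function name find_last_directory and the sample rubbish bytes are assumptions for this example.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
# signature, disk no., directory disk, entries on disk, total entries,
# directory size, directory offset, comment length
EOCD_FORMAT = "<4s4H2LH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes


def find_last_directory(data):
    """Scan backward for the last plausible end-of-directory record.

    Returns (record_offset, dir_offset, dir_size, entry_count) or None.
    """
    pos = data.rfind(EOCD_SIGNATURE)
    while pos != -1:
        if pos + EOCD_SIZE <= len(data):
            fields = struct.unpack_from(EOCD_FORMAT, data, pos)
            entry_count, dir_size, dir_offset = fields[4], fields[5], fields[6]
            # Plausibility check: the directory must end where the record starts.
            if dir_offset + dir_size == pos:
                return pos, dir_offset, dir_size, entry_count
        # Keep looking for an earlier occurrence of the signature.
        pos = data.rfind(EOCD_SIGNATURE, 0, pos)
    return None


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello")
    good = buf.getvalue()
    damaged = good + b"\x00rubbish" * 10  # simulate a half-written append

    found = find_last_directory(damaged)
    print(found)
    if found is not None:
        # Cutting the file right after the record restores a clean archive.
        repaired = damaged[:found[0] + EOCD_SIZE]
        with zipfile.ZipFile(io.BytesIO(repaired)) as zf:
            print(zf.namelist())
```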

I tested appending a file to an archive with two Zip products and with 7Zip, querying the archive's disk address before and after the operation with the fsutil file queryextents command.

None of the three products attempted to optimize the append operation: all three recopied the entire archive when appending the new item.

My conclusion is that, while a product that optimizes appends may exist, it is best to test whether yours does. Without testing, the default assumption should be that the archive will be recopied.
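If you want a quick scripted check, the following Python sketch compares the file's identity (device and file ID from os.stat) before and after an operation. This is a different and coarser technique than inspecting disk extents: it detects the common "write a new copy, then replace the original" pattern, but not a tool that rewrites every byte inside the same file. On Windows, recent Python versions report the NTFS file index in st_ino, as far as I know. The names file_identity, was_replaced, append_bytes and copy_and_replace are made up for this example; in a real test the operation would be a call to your archiver.

```python
import os
import tempfile


def file_identity(path):
    """Return a (device, file ID) pair that changes when the file is replaced."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def was_replaced(path, operation):
    """Run operation() and report whether path now names a different file."""
    before = file_identity(path)
    operation()
    return file_identity(path) != before


def append_bytes(path, data):
    """Stand-in for a tool that appends in place."""
    with open(path, "ab") as f:
        f.write(data)


def copy_and_replace(path, data):
    """Stand-in for a tool that writes a new copy and swaps it in."""
    tmp_path = path + ".tmp"
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        dst.write(src.read())
        dst.write(data)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "archive.bin")
        with open(path, "wb") as f:
            f.write(b"original contents")
        print(was_replaced(path, lambda: append_bytes(path, b"more")))
        print(was_replaced(path, lambda: copy_and_replace(path, b"more")))
```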