Can't unzip 6 GB file: bad zipfile offset
I need to unzip a file of about 6 GB. However, I can't do it either by right-clicking (which gives an error saying "empty archive") or in the terminal, which shows the following:
$ ls
fhs-3.0.pdf IDrive Tese.zip 'Transport Studies of Dual-Gated ABC and ABA Trilayer Graphene: Band Gap Opening and Band Structure Tuning in Very Large Perpendicular Electric Fields - Zou.pdf'
$ unzip Tese.zip
Archive: Tese.zip
warning [Tese.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 4294967296
(attempting to re-compensate)
extracting: Tese/.DS_Store
error: not enough memory for bomb detection
$ jar xvf Tese.zip
java.util.zip.ZipException: only DEFLATED entries can have EXT descriptor
at java.base/java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:313)
at java.base/java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:125)
at jdk.jartool/sun.tools.jar.Main.extract(Main.java:1361)
at jdk.jartool/sun.tools.jar.Main.run(Main.java:409)
at jdk.jartool/sun.tools.jar.Main.main(Main.java:1681)
I don't think the file is corrupt, because I recently unzipped the same file on a Mac.
My PC has 15 GB of RAM and 2 GB of swap memory.
Java only supports files in the ZIP format, not ZIP64. The same goes for the simple zip command.
Classic ZIP files have a size limit of 4 GiB minus 1 byte; Zip64 files have a limit of 16 EiB minus 1 byte. macOS creates Zip64 files when the size is bigger than the classic format supports. The tool your desktop environment provides may have the same limitation.
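As a quick sanity check of the numbers above, here is a small shell sketch (the variable names are made up for illustration) that computes the largest value a 32-bit size or offset field can hold and compares it with the offset unzip printed in the question. It assumes a shell with 64-bit arithmetic, which bash and dash on 64-bit Linux provide.

```shell
# Largest value a 32-bit size/offset field can hold (the classic ZIP limit).
zip32_max=$(( (1 << 32) - 1 ))

# Offset reported by unzip in the question's output.
reported_offset=4294967296

echo "classic ZIP limit: $zip32_max bytes"
echo "reported offset is past the limit by: $(( reported_offset - zip32_max )) byte(s)"
```

The reported offset is exactly one byte past what the classic format can express, which is consistent with the archive having crossed the 4 GiB boundary.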
You need the proper tooling to read Zip64 files. On the command line, ditto can read those files, while in Java you can use Apache Commons Compress.
Both tools, unzip and jar, explicitly told you that the file is corrupt:
miguel@...$ unzip Tese.zip
...
warning [Tese.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 4294967296
(attempting to re-compensate)
miguel@...$ jar xvf Tese.zip
java.util.zip.ZipException: only DEFLATED entries can have EXT
...
If you could unzip it on a Mac, it is possible that some Mac software created that ZIP file in a nonstandard format (as far as ZIP file standardization goes) that only some Mac software can understand.
You could go back to that Mac, unzip the file there, create a compressed tar file from the result, then take that tar file to your PC and unpack it.
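The repacking route could look roughly like this. It is only a sketch: pack_folder and unpack_archive are helper names made up for illustration, the folder name Tese is taken from the unzip output in the question, and it assumes tar with gzip support is available on both machines.

```shell
# Hypothetical helpers; the names are for illustration only.

# Pack a folder into <folder>.tar.gz (run on the Mac after extracting the zip there).
pack_folder() {
    tar -czf "$1.tar.gz" "$1"
}

# Unpack a .tar.gz into the current directory (run on the PC).
unpack_archive() {
    tar -xzf "$1"
}

# On the Mac:  pack_folder Tese
# On the PC:   unpack_archive Tese.tar.gz
```

tar has no 4 GiB limit of the kind discussed here, so the size of the archive should not be a problem on either side.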
You can test whether your copy of unzip supports the Zip64 extension (and archives larger than 4 GiB) by running unzip -v and checking for the presence of ZIP64_SUPPORT. Below is what I get on my Ubuntu setup. Note that it does have ZIP64_SUPPORT.
If your unzip doesn't display ZIP64_SUPPORT, you need a newer version of unzip.
Alternatively, if your unzip does have ZIP64_SUPPORT, the problem is something else. Can you share any details about how the zip file was created? Is the zip file publicly available anywhere?
$ unzip -v
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.
Latest sources and executables are at ftp://ftp.info-zip.org/pub/infozip/ ;
see ftp://ftp.info-zip.org/pub/infozip/UnZip.html for other sites.
Compiled with gcc 9.2.0 for Unix (Linux ELF).
UnZip special compilation options:
ACORN_FTYPE_NFS
COPYRIGHT_CLEAN (PKZIP 0.9x unreducing method not supported)
SET_DIR_ATTRIB
SYMLINKS (symbolic links supported, if RTL and file system permit)
TIMESTAMP
UNIXBACKUP
USE_EF_UT_TIME
USE_UNSHRINK (PKZIP/Zip 1.x unshrinking method supported)
USE_DEFLATE64 (PKZIP 4.x Deflate64(tm) supported)
UNICODE_SUPPORT [wide-chars, char coding: UTF-8] (handle UTF-8 paths)
LARGE_FILE_SUPPORT (large files over 2 GiB supported)
ZIP64_SUPPORT (archives using Zip64 for large files supported)
USE_BZIP2 (PKZIP 4.6+, using bzip2 lib version 1.0.8, 13-Jul-2019)
VMS_TEXT_CONV
WILD_STOP_AT_DIR
[decryption, version 2.11 of 05 Jan 2007]
UnZip and ZipInfo environment options:
UNZIP: [none]
UNZIPOPT: [none]
ZIPINFO: [none]
ZIPINFOOPT: [none]
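If you'd rather not scan that listing by eye, a small filter along these lines can do the check. This is a sketch, and check_zip64 is a made-up helper name: it reads the output of unzip -v on standard input and looks for the ZIP64_SUPPORT string shown above.

```shell
# Hypothetical helper: reads `unzip -v` output on stdin and reports
# whether the ZIP64_SUPPORT compilation option is listed.
check_zip64() {
    if grep -q 'ZIP64_SUPPORT'; then
        echo "unzip has ZIP64_SUPPORT"
    else
        echo "unzip lacks ZIP64_SUPPORT"
    fi
}

# Usage:  unzip -v | check_zip64
```

With the output listed above, it should report that ZIP64_SUPPORT is present.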