Recover files from corrupt archive

I have a tar file of 40 GB, but when I open it with the archive manager on Ubuntu I see only 512 MB. Most of my files are missing. Even when I run:

tar -tf myfile.tar.gz

I don't get the list of all my files. When I created the file, I used the command:

tar -zcvf myfile.tar.gz myhomefolder

Then I stopped the command because I didn't want to compress (it takes too long) and ran:

tar cvf myfile.tar.gz myhomefolder

On Windows when I use 7-Zip I get this error:

There are data after the end of file

The problem is that there's a bad end in the file. How can I remove that end? What software can open the file in binary mode or something like that?

How can I get my files?


You could try the utility gzrecover and then extract the recovered archive with cpio.

It has assisted with recovering corrupted archives in the past, but your mileage may vary. I'm having an issue where it can't recover an archive for me, so fair warning.


EDIT: With a mod threatening me violently with a knife, I'll add some details on how I recovered archives previously. (I'm kidding. Don't hurt me, @Scott!)

You can install both packages by using

sudo apt-get install cpio gzrt

After you've ensured both packages are installed, let's get to work.

First, put your archive where you don't mind the data being extracted. From there, you can run

gzrecover broken-archive.tar.gz

gzrecover will likely take some time, depending on how large the archive is. However, with any luck, you'll see "broken-archive.tar.recovered" in the same directory after the operation has completed.
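Since gzrecover works on gzip data, and the question mentions that the second run used "tar cvf" without -z, it may be worth checking what the file actually contains before spending hours on recovery. The following is only a sketch: detect_format is a hypothetical helper name, and it relies on two well-known signatures (gzip data starts with the bytes 1f 8b, and POSIX/GNU tar headers carry the string "ustar" at byte offset 257). Older tar formats without that string would show up as "unknown".

```shell
#!/bin/sh
# detect_format FILE: print "gzip", "tar" or "unknown" based on magic bytes.
# Hypothetical helper for illustration; not part of gzrt or tar.
detect_format() {
    # gzip streams begin with the two bytes 0x1f 0x8b
    magic=$(dd if="$1" bs=1 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        echo gzip
        return 0
    fi
    # ustar/GNU tar headers contain "ustar" at byte offset 257
    tarmagic=$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null | tr -d '\0')
    if [ "$tarmagic" = "ustar" ]; then
        echo tar
    else
        echo unknown
    fi
}

# Example (file name taken from the question):
# detect_format myfile.tar.gz
```

If this prints "tar", the file is probably an uncompressed archive despite its .gz name, and gzrecover is likely not the right tool for it.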

The recovered archive has likely sustained some kind of damage as well; this is where cpio comes in handy. It should already be installed on your system.

To extract the recovered archive, use

cpio -F broken-archive.tar.recovered -i -v

Be aware that cpio will spew out an extremely large amount of information during this time. I recommend you do this on a separate tty entirely instead of a terminal in your X session. To do this, just press CTRL+ALT+F1 and your screen will switch to a text terminal. To return to your X session, press CTRL+ALT+F8 (this may vary, but it is usually one of F7 to F9).
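As an alternative to switching ttys (this is my own assumption, not something the steps above require), the verbose output can be sent to a file so the terminal does not have to render it. A minimal sketch, where run_logged and cpio.log are hypothetical names:

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Run COMMAND with both stdout and stderr appended to LOGFILE.
# Hypothetical helper for illustration only.
run_logged() {
    log=$1
    shift
    "$@" >>"$log" 2>&1
}

# Example, using the file name from the steps above:
# run_logged cpio.log cpio -F broken-archive.tar.recovered -i -v
```

The log can then be inspected afterwards, for example with less or grep, to see which files were extracted and which produced errors.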

Extracting the recovered archive is also going to be rather intensive. As cpio is not optimized for multiple cores, it will stick to one core, and whichever core that is will probably sit at 100% usage. If you have a dual-core machine, you might not be able to use your computer comfortably for a while.

Be forewarned that this process could actually take several days, especially considering the size of the archive. Either find a different machine to offload this work onto (slower CPUs will make this take longer), plan not to use your computer for a while, or just brace for your system choking while it works. In addition, if you do run this within a terminal in your X session, you'll likely pin a second core at 100%. If you're an advanced user, you can use the taskset command to assign the process(es) to a specific core. However, that is outside the scope of my answer.
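For readers curious about the taskset route mentioned above, here is a rough sketch rather than a tested recipe. It assumes the taskset from util-linux, whose -c option takes a CPU list; run_pinned is a hypothetical wrapper name, and core 0 is an arbitrary choice.

```shell
#!/bin/sh
# run_pinned CORE COMMAND [ARGS...]
# Run COMMAND restricted to the given CPU core when taskset is available;
# otherwise run it normally. Hypothetical helper for illustration only.
run_pinned() {
    core=$1
    shift
    if command -v taskset >/dev/null 2>&1; then
        taskset -c "$core" "$@"
    else
        "$@"
    fi
}

# Example: keep the extraction on core 0
# run_pinned 0 cpio -F broken-archive.tar.recovered -i -v
```

Pinning does not make the extraction faster; it only keeps the load on a core of your choosing so the others stay free.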

I wish you the best of luck.