Recovering corrupted files uploaded in wrong FTP mode
Some files were uploaded to an FTP server in the wrong mode (via the command line). I believe I have some binary files that were uploaded in TEXT mode, and now I cannot open them.
I don't have access to the original files. Can I somehow recover from this? Is there some tool that will allow me to get the files back in their correct format?
I recently had to face the same problem: Linux -> Windows, ASCII mode. I've finished writing a program in Python that recovers binaries that were transferred in ASCII mode. It's a byte brute-forcer, and here is how it works:
- Open damaged archive as byte stream.
- Find all occurrences of 0d followed by 0a (ASCII 13, ASCII 10).
- Remove the 0d from every occurrence of 0d followed by 0a and store the byte addresses.
- Cycle through the stored addresses, restoring the 0d at some of them in case it was supposed to be there in the binary, and try to open the result after each attempt. In my case I was dealing with bz2 archives, so I had a CRC check verify the integrity of the uncompressed data against the checksum stored in the archive.
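The steps above could be sketched roughly as follows. This is not the original program, only a minimal illustration under some assumptions: the names `candidates`, `recover_bz2`, and the `max_restored` cap are made up for this example, and the integrity check simply relies on `bz2.decompress`, which fails when the stream's stored CRC does not match.

```python
import bz2
from itertools import combinations


def candidates(damaged: bytes, max_restored: int = 2):
    """Yield candidate originals, fewest restored 0d bytes first."""
    # Splitting on 0d 0a and re-joining with 0a alone drops every suspect 0d.
    pieces = damaged.split(b"\r\n")
    gaps = range(len(pieces) - 1)
    for count in range(max_restored + 1):
        for restored in combinations(gaps, count):
            out = bytearray(pieces[0])
            for i in gaps:
                # Put the 0d back only at the positions chosen for this attempt.
                out += b"\r\n" if i in restored else b"\n"
                out += pieces[i + 1]
            yield bytes(out)


def recover_bz2(damaged: bytes, max_restored: int = 2):
    """Return the payload of the first candidate bz2 accepts, or None."""
    for candidate in candidates(damaged, max_restored):
        try:
            # Decompression raises if the data or its CRC is invalid.
            return bz2.decompress(candidate)
        except (OSError, ValueError):
            continue
    return None
```

The cap on restored bytes keeps the search small; raising `max_restored` widens the search at the cost of many more attempts.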
The number of genuine 0d 0a byte pairs in a binary will not be very high, because the probability of a binary containing such a pair at any given position is quite low. With this brute-force method, fixing a bz2 archive under 100 KB takes less than 10 seconds. I have not tried it with other types of files, but it should be possible.
I am not going to paste the code here, since this question is not programming related, and this was a sort of competition task and I don't think I'm comfortable with taking the sources public, but if you do require it, please let me know.
Cheers, and Merry Christmas everyone! :)
Whether the destruction can be undone depends on the operating systems involved, that is, on the combination you use on the server and the client.
The worst problem is the end-of-line sequence. Windows uses a carriage return (ASCII value 13) followed by a line feed (ASCII value 10), while Linux uses only a line feed.
A text-mode FTP transfer translates between these two conventions. Binary mode doesn't. That translation is where the destruction comes in.
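As a small illustration of what that translation does to binary data, here is a sketch for the Linux to Windows direction. The function name is made up, and it assumes a naive translation that turns every LF into CR LF; real clients and servers may behave differently in the details.

```python
def unix_to_windows_text(data: bytes) -> bytes:
    # Assumed naive text-mode translation: every LF becomes CR LF.
    return data.replace(b"\n", b"\r\n")


binary = bytes([0x42, 0x5A, 0x0A, 0x00, 0x0A])
text_mode = unix_to_windows_text(binary)   # bytes are altered
binary_mode = binary                       # binary mode copies bytes unchanged

print(text_mode.hex(" "))   # 42 5a 0d 0a 00 0d 0a
print(len(binary), len(text_mode))
```

Every 0a in the file gains a 0d in front of it, so the file grows and any format that depends on exact byte offsets breaks.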
If the transfer went from Windows to Linux, it would be impossible to determine whether an LF was originally a lone LF or a CR-LF pair. As data is lost, undoing the destruction is next to impossible.
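A short sketch of why that direction loses information, again assuming a naive translation (the function names here are made up for the example): two different originals end up as the same bytes, and every LF in the received file doubles the number of originals it could have come from.

```python
def windows_to_unix_text(data: bytes) -> bytes:
    # Assumed naive text-mode translation: every CR LF collapses to LF.
    return data.replace(b"\r\n", b"\n")


original_a = b"\x01\r\n\x02\n"
original_b = b"\x01\n\x02\r\n"

received = windows_to_unix_text(original_a)
same = received == windows_to_unix_text(original_b)   # both collapse to one result


def possible_originals(received: bytes) -> int:
    # Each LF may have been a lone LF or a CR LF pair before the transfer.
    return 2 ** received.count(b"\n")
```

Without an integrity check such as a checksum stored inside the file, there is no way to tell which of those originals was the real one.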