How can I avoid garbled file names when unzipping Unicode files?

I often see garbled file names when I extract zip files.

For example,

 ╕╢╣¤└╟╝║-Bb└╠┴╢╛╟▒т┐ы-┼м╢є╕о│▌,┼╫│╩╗Ў╝╥╞∙ ╝╥╟┴╢є│ы╗Ў╝╥╞∙ ,╞о╖│╞ъ

How can I fix this?


Try p7zip. (@Pilot6 pointed out that p7zip does not handle non-UTF-8 encodings well, but if your file names are UTF-8, it is an easy solution.)

sudo apt-get update
sudo apt-get install p7zip-full
7z x thefile.zip -o"outputDir"

The original files may have been zipped on Korean MS Windows, which uses the CP949 encoding. Try unzip with its encoding option:

unzip -O CP949 <file.zip>

Note: I checked the original poster's profile to see where they are from (Seoul, South Korea). For other archives, check where the zip came from and change the encoding accordingly.
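
If you are not sure which encoding the archive came from, one way to test a guess before extracting again is to push an already garbled name back through iconv. This is only a sketch: fix_name is a hypothetical helper, not part of unzip or p7zip, and the default of CP866 as the code page the name was wrongly displayed in is my assumption (the sample name in the question looks consistent with it, but your system may differ).

```shell
# Hypothetical helper: undo one specific kind of garbling, where file-name
# bytes in a legacy encoding were displayed using the wrong code page.
#   $1 = garbled name as shown on a UTF-8 terminal
#   $2 = code page the bytes were wrongly displayed as (default CP866, an assumption)
#   $3 = encoding the archive was probably created with (default CP949)
fix_name() {
    shown_as=${2:-CP866}
    real=${3:-CP949}
    # Step 1: turn the displayed characters back into the original raw bytes.
    # Step 2: decode those bytes with the encoding you are guessing.
    printf '%s' "$1" | iconv -f UTF-8 -t "$shown_as" | iconv -f "$real" -t UTF-8
}

# Try the first part of the garbled name from the question.
fix_name '╕╢╣¤└╟╝║'
echo
```

If the guess is right, the output is readable text instead of box-drawing characters, and the same encoding name can then be passed to unzip -O. If iconv reports an error or the result is still nonsense, try another pair of encodings, for example fix_name "$name" CP437 CP932 for an archive that might have come from Japanese Windows.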


If you use the standard Ubuntu Archive Manager (file-roller) on Ubuntu 14.04 or later, you can solve this by installing a patched file-roller. The problem is that file-roller uses p7zip to extract zip archives when p7zip-full is installed, and p7zip does not handle non-UTF-8 encodings well. I patched file-roller to always use unzip for zip archives; unzip itself has already been fixed. The patched file-roller can be installed from my PPA:

sudo add-apt-repository ppa:hanipouspilot/file-roller
sudo apt-get update
sudo apt-get install file-roller