Best practice to replace unknown chars from unknown charsets in filenames?
Solution 1:
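In theory it can be tricky to know which character encoding the filenames use, but in most cases the problem comes from Windows systems and programs that still write a legacy code page (Latin1, CP1252 or CP850, for example) instead of UTF-8. In the folder with the broken files, run
convmv -f cp850 -t utf-8 *
(with the * unquoted) and look at the output. By default convmv only prints what it would do; add --notest to actually rename the files. If the proposed names look wrong, try a different source encoding with -f, such as latin1 or cp1252.
(You need the convmv package installed.)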
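If the source encoding is uncertain, it can help to preview how one broken name decodes under a few candidate encodings before renaming anything. The following is only a minimal sketch: preview_decodings is a hypothetical helper name, it assumes iconv is installed, and the encoding names passed to it are examples rather than a complete list.

```shell
# Hypothetical helper (not part of convmv): print how a raw filename would
# decode under each candidate encoding given after the name.
preview_decodings() {
    name=$1
    shift
    for enc in "$@"; do
        # iconv exits non-zero if the bytes are not valid in this encoding
        if decoded=$(printf '%s' "$name" | iconv -f "$enc" -t UTF-8 2>/dev/null); then
            printf '%-8s %s\n' "$enc" "$decoded"
        else
            printf '%-8s (not valid in this encoding)\n' "$enc"
        fi
    done
}
```

For example, preview_decodings "$f" CP850 CP1252 ISO-8859-1 prints one line per encoding for the file name in $f; the line that reads correctly suggests the value to pass to convmv -f.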
Solution 2:
If you just want to get rid of some characters, you could try this:
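rename "s/[^A-Za-z0-9_-]/_/g" *
That replaces every character that is not an ASCII letter, a digit, a dash or an underscore with an underscore. Note that this includes the dot, so file extensions are changed too; add . inside the brackets if you want to keep them. Run it with the -n option first to see what would happen in a dry run. (This assumes the Perl-based rename, which some distributions ship as prename or file-rename.)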
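Where the Perl-based rename is not available, the same idea can be sketched in plain shell. This is an illustration under stated assumptions, not the method from the answer: sanitize_name and sanitize_dir are hypothetical helper names, the character class additionally keeps the dot so extensions survive, and LC_ALL=C is set so that sed works byte by byte (a multi-byte character therefore becomes several underscores).

```shell
# Hypothetical helper: map every byte outside A-Z a-z 0-9 . _ - to "_".
sanitize_name() {
    printf '%s' "$1" | LC_ALL=C sed 's/[^A-Za-z0-9._-]/_/g'
}

# Hypothetical helper: process the current directory.
# Dry run by default; call "sanitize_dir apply" to actually rename.
sanitize_dir() {
    mode=${1:-dry}
    for f in *; do
        [ -e "$f" ] || continue            # empty directory: glob did not match
        new=$(sanitize_name "$f")
        [ "$f" = "$new" ] && continue      # name is already clean
        if [ -e "$new" ]; then
            echo "skip: $f ($new already exists)"
            continue
        fi
        if [ "$mode" = apply ]; then
            mv -- "$f" "$new"
        else
            echo "$f -> $new"
        fi
    done
}
```

Calling sanitize_dir with no argument only lists the planned renames, and existing target names are skipped rather than overwritten.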