Fix encoding of German umlauts in directories and filenames (ü = u╠ê and so on)

The reason you're getting the "already UTF-8" warning is that those strings really are UTF-8 already. The "ü" character was encoded macOS-style, in decomposed form: a 'u' followed by the two bytes "\xCC" and "\x88". Together these two bytes are the UTF-8 encoding of U+0308, the combining diaeresis.
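To see the difference at the byte level, something like the following may help. It is only a sketch: `hex_bytes` is a helper name made up for this example, and octal escapes are used because not every `printf` understands `\x`.

```shell
# hex_bytes: print the bytes read from stdin as space-separated hex.
# (hypothetical helper, not part of any tool mentioned in this thread)
hex_bytes() {
    od -An -tx1 | tr -s ' \n' ' ' | sed -e 's/^ //' -e 's/ $//'
}

# Decomposed form: 'u' followed by U+0308 (bytes CC 88, octal 314 210).
printf 'u\314\210' | hex_bytes; echo    # 75 cc 88

# Precomposed form: U+00FC as a single code point (bytes C3 BC).
printf '\303\274' | hex_bytes; echo     # c3 bc
```

Both byte sequences are valid UTF-8 and render as "ü" in a UTF-8 terminal; only the first one turns into "u╠ê" when read as CP437.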

If you look at a code page 437 listing, you'll see that byte \xCC is displayed as "╠" and byte \x88 as "ê".

Whatever you are using to display those character sequences is interpreting them as CP437 rather than as UTF-8.

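A quick proof, if you read Ruby. The first command reinterprets the bytes as CP437 and reproduces your garbled output; the second prints them unchanged, and my UTF-8 terminal displays them as expected:

$ ruby -e 'puts "u\xCC\x88"' | iconv -f cp437 -t utf-8
u╠ê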
$ ruby -e 'puts "u\xCC\x88"'
ü

Warning: Draft!

I tried to fix this with "detox" but couldn't find a way to chain characters together. Based on @S2VpdGgA's answer, I put together this collection of commands.

Because I preview everything first (with the echo), this is safe enough for me on the rare occasions I need it.

Really, though, somebody may want to do this properly. There may be loads of other cases, such as "é", "à", "è" and so on.

######
# Preparation and tests
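# The decomposed characters may have been normalized by this web form; if so,
# copy the character sequences from your own filenames instead.
# My reconstruction follows.

# Note: each echo sends the character sequence as copied from the console.
# The perl line prints every non-ASCII byte as \xHH; the sed line tests the replacement.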

echo 'ä' | perl -pe 's/([^\x00-\x7f])/"\\x" . sprintf "%x", ord $1/ge'
echo 'ä' | sed -e 's/a\xcc\x88/ä/'

echo 'ö' | perl -pe 's/([^\x00-\x7f])/"\\x" . sprintf "%x", ord $1/ge'
echo 'ö' | sed -e 's/o\xcc\x88/ö/'

echo 'ü' | perl -pe 's/([^\x00-\x7f])/"\\x" . sprintf "%x", ord $1/ge'
echo 'ü' | sed -e 's/u\xcc\x88/ü/'

echo 'Ä' | perl -pe 's/([^\x00-\x7f])/"\\x" . sprintf "%x", ord $1/ge'
echo 'Ä' | sed -e 's/A\xcc\x88/Ä/'

echo 'Ö' | perl -pe 's/([^\x00-\x7f])/"\\x" . sprintf "%x", ord $1/ge'
echo 'Ö' | sed -e 's/O\xcc\x88/Ö/'

echo 'Ü' | perl -pe 's/([^\x00-\x7f])/"\\x" . sprintf "%x", ord $1/ge'
echo 'Ü' | sed -e 's/U\xcc\x88/Ü/'

# Final version
# test all at once
# (one sed call; the g flag replaces every occurrence, not just the first per line)
echo 'Ä' | sed -e 's/a\xcc\x88/ä/g' -e 's/o\xcc\x88/ö/g' -e 's/u\xcc\x88/ü/g' -e 's/A\xcc\x88/Ä/g' -e 's/O\xcc\x88/Ö/g' -e 's/U\xcc\x88/Ü/g'


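# wrap into a loop
# note: this only handles one directory level (-maxdepth 1); it is not
# recursive as-is because folder names can change while find is walking them
# note: mv -T is a GNU coreutils option

cd /path/to/dir
find . -maxdepth 1 | while IFS= read -r FILE ; do
    # IFS= and -r keep leading spaces and backslashes in names intact
    newfile="$(printf '%s\n' "${FILE}" | sed -e 's/a\xcc\x88/ä/g' -e 's/o\xcc\x88/ö/g' -e 's/u\xcc\x88/ü/g' -e 's/A\xcc\x88/Ä/g' -e 's/O\xcc\x88/Ö/g' -e 's/U\xcc\x88/Ü/g')"
    # skip names that need no change
    [ "${FILE}" = "${newfile}" ] || echo mv -T "${FILE}" "${newfile}"
done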

# (remove the 'echo ' to actually make changes)
#######
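
The loop above stops at one directory level. A recursive variant could walk the tree deepest-first and change only the last path component, so that a parent directory is never renamed before its children. This is a rough sketch, not a tested tool: the function names are made up, `\xHH` escapes and `-mindepth` assume GNU sed and GNU find, and it only prints the `mv` commands (unquoted, for reading rather than for pasting into a shell).

```shell
# compose_umlauts: rewrite a/o/u/A/O/U followed by U+0308 (bytes CC 88)
# on stdin as the precomposed UTF-8 letter. LC_ALL=C makes sed work on
# raw bytes; \xHH escapes are a GNU sed extension (assumption).
compose_umlauts() {
    LC_ALL=C sed -e 's/a\xcc\x88/\xc3\xa4/g' -e 's/o\xcc\x88/\xc3\xb6/g' \
                 -e 's/u\xcc\x88/\xc3\xbc/g' -e 's/A\xcc\x88/\xc3\x84/g' \
                 -e 's/O\xcc\x88/\xc3\x96/g' -e 's/U\xcc\x88/\xc3\x9c/g'
}

# preview_renames DIR: print, but do not run, one mv command per entry
# below DIR whose name would change. -depth lists children before parents.
preview_renames() {
    find "$1" -depth -mindepth 1 | while IFS= read -r path; do
        dir=$(dirname -- "$path")
        base=$(basename -- "$path")
        new=$(printf '%s\n' "$base" | compose_umlauts)
        [ "$base" = "$new" ] || printf 'mv -- %s %s\n' "$dir/$base" "$dir/$new"
    done
}

# Usage: preview_renames /path/to/dir
```

Names containing newlines are not handled, since the loop reads one path per line.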