tr: convert apostrophe to ASCII

I'm trying to convert a Right Single Quotation Mark (U+2019) to an ASCII Apostrophe (U+0027) using tr.

tr "`echo -e '\xE2\x80\x99'`" "`echo -e '\x27'`" < a > b

given a UTF-8 encoded file called a which contains this example:

We’re not a different species
“All alone?” Jeth mentioned.

OS X uses the BSD tr and produces a nice result:

We're not a different species
“All alone?” Jeth mentioned.

Ubuntu uses the GNU tr and produces this nasty result:

We'''re not a different species
''<9C>All alone?''<9D> Jeth mentioned.

How can I accomplish this conversion in Ubuntu?


You could try some other tool, like sed:

$ sed "s/’/'/g" <a
We're not a different species
“All alone?” Jeth mentioned.

Or, since we are doing simple translation, use the y command for sed:

$ sed "y/’/'/" <a
We're not a different species
“All alone?” Jeth mentioned.
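Typing ’ directly into the command relies on the terminal and shell passing the character through as UTF-8. As a rough sketch, the same substitution can be written with the bytes spelled out through printf octal escapes (the variable names rsquo and converted are only illustrative, not part of the question):

```shell
# U+2019 is the byte sequence E2 80 99 in UTF-8, written here in octal.
rsquo=$(printf '\342\200\231')

# s/// matches the whole three-byte sequence, so this should behave the same
# whether sed treats its input as UTF-8 characters or as plain bytes.
converted=$(printf 'We%sre not\n' "$rsquo" | sed "s/$rsquo/'/g")
printf '%s\n' "$converted"
```

The y form is more sensitive to the locale: it only sees ’ as a single character when sed runs in a UTF-8 locale, and in a byte-oriented locale such as C, GNU sed will probably reject it because the two strings then differ in length.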

GNU tr presumably fails because it operates on bytes rather than characters. Its manual says:

Currently tr fully supports only single-byte characters. Eventually it will support multibyte characters; when it does, the -C option will cause it to complement the set of characters, whereas -c will cause it to complement the set of values. This distinction will matter only when some values are not characters, and this is possible only in locales using multibyte encodings when the input contains encoding errors.

And ’ is a multibyte character (three bytes in UTF-8), unlike the ASCII apostrophe:

$ echo -n \' | wc -c
1
$ echo -n ’ | wc -c
3
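To see how those three bytes turn into three apostrophes, here is a small sketch that forces byte-wise behaviour with LC_ALL=C (the variable names quote_bytes and tripled are made up for illustration):

```shell
# The three UTF-8 bytes of U+2019 (E2 80 99) as octal escapes.
# Both printf and tr understand this backslash-octal notation.
quote_bytes='\342\200\231'

# Byte-wise, SET1 has three members and SET2 has only one. GNU tr pads SET2
# with its last character, so every one of the three bytes becomes "'".
tripled=$(printf "$quote_bytes" | LC_ALL=C tr "$quote_bytes" "'")
printf '%s\n' "$tripled"
```

The same byte-wise mapping explains the ''<9C> in the question's output: “ is E2 80 9C in UTF-8, so its first two bytes are in the set and get replaced, while the stray 9C byte is left behind.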

If you also want to convert the double quotes, and perhaps other characters, you could use GNU iconv:

$ iconv -f utf-8 -t ascii//translit < a
We're not a different species
"All alone?" Jeth mentioned.

The //TRANSLIT suffix tells iconv that for characters outside the repertoire of the target encoding (here ASCII), it can substitute similar-looking characters or sequences automatically. Without the suffix, iconv will give up as soon as it finds an untranslatable character.

Note that //TRANSLIT seems to be a GNU extension: POSIX iconv doesn't support it.
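Where //TRANSLIT is not available, one possible fallback is a handful of sed substitutions. This is only a sketch: the function name to_ascii_quotes is hypothetical, and it covers just the four common curly quotes rather than everything iconv can transliterate.

```shell
# Hypothetical helper: replace the four common curly quotes with ASCII ones.
to_ascii_quotes() {
    lsq=$(printf '\342\200\230')   # U+2018 left single quotation mark
    rsq=$(printf '\342\200\231')   # U+2019 right single quotation mark
    ldq=$(printf '\342\200\234')   # U+201C left double quotation mark
    rdq=$(printf '\342\200\235')   # U+201D right double quotation mark
    # One s/// per character: a bracket expression could match single bytes
    # instead of whole characters when sed runs in a byte-oriented locale.
    sed -e "s/$lsq/'/g" -e "s/$rsq/'/g" -e "s/$ldq/\"/g" -e "s/$rdq/\"/g"
}

# Demo on the two sample lines from the question.
printf 'We\342\200\231re not a different species\n' | to_ascii_quotes
printf '\342\200\234All alone?\342\200\235 Jeth mentioned.\n' | to_ascii_quotes
```

With the function defined, the conversion from the question would be run as to_ascii_quotes < a > b.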