How can I transform a string to UTF-8 in C#?

Solution 1:

Since you know the string was decoded using Encoding.Default, you can simply use:

byte[] bytes = Encoding.Default.GetBytes(myString);
myString = Encoding.UTF8.GetString(bytes);

One more thing to remember: if you print strings with Console.WriteLine, you should also set Console.OutputEncoding = System.Text.Encoding.UTF8; first. Otherwise UTF-8 strings are written in the console's default code page (GBK on my machine).
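
Note that Encoding.Default is platform-dependent: on .NET Framework it is the system's ANSI code page, while on .NET Core and .NET 5+ it is UTF-8, in which case the round trip above changes nothing. As a hedged sketch, the same repair can name the wrong encoding explicitly. The FixMojibake helper and the choice of ISO-8859-1 are assumptions made for illustration, not part of the original answer:

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: re-encodes the garbled text with the encoding that
    // was wrongly used to decode it, then decodes those bytes as UTF-8.
    public static string FixMojibake(string garbled, Encoding wrongEncoding)
    {
        byte[] originalBytes = wrongEncoding.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    public static void Main()
    {
        // Assumption: the text was mis-decoded as ISO-8859-1 (Latin-1).
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        // "Acción" written with escapes so the source file's encoding does not matter.
        string garbled = "Acci\u00C3\u00B3n";
        string repaired = FixMojibake(garbled, latin1);

        Console.WriteLine(repaired == "Acci\u00F3n"); // True
    }
}
```

If the text was garbled by a different code page, pass that encoding instead. The repair only works when the wrong encoding can map every garbled character back to its original byte.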

Solution 2:

string utf8String = "Acción";
string propEncodeString = string.Empty;

// Casting each char to a byte is only safe when every char is in the range U+0000 to U+00FF.
byte[] utf8_Bytes = new byte[utf8String.Length];
for (int i = 0; i < utf8String.Length; ++i)
{
    utf8_Bytes[i] = (byte)utf8String[i];
}

propEncodeString = Encoding.UTF8.GetString(utf8_Bytes, 0, utf8_Bytes.Length);

The output should look like:

Acción

Similarly, day’s is displayed as day’s when you call DecodeFromUtf8():

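private static void DecodeFromUtf8()
{
    // UTF-8 bytes of "day’s" that were wrongly decoded with Encoding.Default.
    string utf8_String = "day’s";
    // Recover the original bytes, then decode them as UTF-8.
    byte[] bytes = Encoding.Default.GetBytes(utf8_String);
    utf8_String = Encoding.UTF8.GetString(bytes);
    Console.WriteLine(utf8_String);
}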

Solution 3:

Your code is reading a sequence of UTF-8-encoded bytes and decoding them using an 8-bit encoding.

You need to fix that code so that it decodes the bytes as UTF-8.

Alternatively (not ideal), you could convert the bad string back to the original byte array by encoding it with the same incorrect encoding, and then re-decode those bytes as UTF-8.
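
A minimal sketch of the preferred fix, decoding as UTF-8 at the point where the bytes are read. The ReadUtf8 helper, the in-memory stream, and ISO-8859-1 as the wrong 8-bit encoding are assumptions for illustration; your bytes may come from a file, socket, or HTTP response instead:

```csharp
using System;
using System.IO;
using System.Text;

public static class DecodeAtSourceDemo
{
    // Hypothetical helper: reads all text from a stream, decoding the bytes as UTF-8.
    public static string ReadUtf8(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // The bytes as they arrive: "Acción" encoded as UTF-8.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("Acci\u00F3n");

        // Decoding with an 8-bit encoding produces the garbled "Acción".
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        string wrong = latin1.GetString(utf8Bytes);

        // Decoding as UTF-8 yields the intended text.
        string right = ReadUtf8(new MemoryStream(utf8Bytes));

        Console.WriteLine(wrong == "Acci\u00C3\u00B3n"); // True
        Console.WriteLine(right == "Acci\u00F3n");       // True
    }
}
```

Fixing the decoder this way avoids the round trip entirely, so nothing is lost when the wrong encoding cannot represent some byte values.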

Solution 4:

byte[] utf8Bytes = Encoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(mystring));
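
Keep in mind that Encoding.Convert returns a byte[], not a string. A .NET string is always UTF-16 in memory, so "converting to UTF-8" here means producing UTF-8 bytes, and it does not repair text that is already garbled. The sketch below uses ISO-8859-1 as a stand-in for Encoding.Default and a hypothetical ToUtf8Bytes helper, and compares the result with calling Encoding.UTF8.GetBytes directly:

```csharp
using System;
using System.Linq;
using System.Text;

public static class ConvertDemo
{
    // Hypothetical helper: transcodes text to UTF-8 bytes via an intermediate encoding.
    public static byte[] ToUtf8Bytes(string text, Encoding intermediate)
    {
        return Encoding.Convert(intermediate, Encoding.UTF8, intermediate.GetBytes(text));
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for Encoding.Default.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        string text = "Acci\u00F3n"; // "Acción"

        byte[] viaConvert = ToUtf8Bytes(text, latin1);
        byte[] direct = Encoding.UTF8.GetBytes(text);

        Console.WriteLine(BitConverter.ToString(viaConvert)); // 41-63-63-69-C3-B3-6E
        Console.WriteLine(viaConvert.SequenceEqual(direct));  // True
    }
}
```

When every character is representable in the intermediate encoding, the two results match, so Encoding.UTF8.GetBytes(mystring) is the simpler way to get UTF-8 bytes from a string.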

Solution 5:

@anothershrubery's answer worked for me. I've wrapped it in an extension method in a StringExtensions class so I can easily convert any string in my program.

Method:

public static class StringExtensions
{
    public static string ToUTF8(this string text)
    {
        return Encoding.UTF8.GetString(Encoding.Default.GetBytes(text));
    }
}

Usage:

string myString = "Acción"; // UTF-8 text that was mis-decoded with Encoding.Default
string strConverted = myString.ToUTF8();

Or simply:

string strConverted = "Acción".ToUTF8();