How to convert UTF-8 byte[] to string
I have a byte[] array that is loaded from a file that I happen to know contains UTF-8.
In some debugging code, I need to convert it to a string. Is there a one-liner that will do this?
Under the covers it should be just an allocation and a memcopy, so even if it is not implemented, it should be possible.
Solution 1:
string result = System.Text.Encoding.UTF8.GetString(byteArray);
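Two things can surprise you with the one-liner: GetString keeps a leading byte order mark as the character U+FEFF, and by default it silently replaces malformed bytes with U+FFFD. Here is a minimal sketch of a stricter variant, assuming you want an exception on bad input. DecodeUtf8Strict is a hypothetical helper name, not a framework method.

```csharp
using System;
using System.Text;

// Hypothetical helper: strict UTF-8 decode that also drops a leading BOM.
static string DecodeUtf8Strict(byte[] data)
{
    // throwOnInvalidBytes: true makes malformed input raise
    // DecoderFallbackException instead of yielding U+FFFD.
    var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                                  throwOnInvalidBytes: true);
    string text = strict.GetString(data);

    // GetString decodes a byte order mark as U+FEFF, so trim it here.
    return text.TrimStart('\uFEFF');
}

byte[] withBom = { 0xEF, 0xBB, 0xBF, 0x68, 0xC3, 0xA9 }; // BOM + "hé"
Console.WriteLine(DecodeUtf8Strict(withBom).Length);     // prints 2

try
{
    // Arbitrary binary data that is not valid UTF-8.
    DecodeUtf8Strict(new byte[] { 130, 200, 234, 23 });
}
catch (DecoderFallbackException)
{
    Console.WriteLine("not valid UTF-8");
}
```

For throwaway debugging code the plain Encoding.UTF8.GetString call is usually enough. The strict version is only worth it if you need to notice that the file is not really UTF-8.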
Solution 2:
There are at least four different ways to do this conversion.
1. Encoding's GetString. Note that you won't be able to get the original bytes back if they are not valid text in that encoding (for example, arbitrary binary data decoded as UTF-8).
2. BitConverter.ToString. The output is a "-" delimited hex string, but there's no built-in .NET method that parses this dash-delimited string back to a byte array.
3. Convert.ToBase64String. You can easily convert the output string back to a byte array by using Convert.FromBase64String. Note: The output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to explicitly encode it.
4. HttpServerUtility.UrlTokenEncode. You can easily convert the output string back to a byte array by using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is that it needs the System.Web assembly if your project is not a web project.
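If you want a URL-friendly string without referencing System.Web, one option is to post-process the Base64 output yourself. The sketch below follows the "base64url" alphabet from RFC 4648 with padding removed. ToBase64Url and FromBase64Url are hypothetical helper names, and their output is not interchangeable with UrlTokenEncode, which appends a padding-count digit instead of dropping the padding.

```csharp
using System;

// Hypothetical helper: Base64 with '+' -> '-', '/' -> '_' and no '=' padding.
static string ToBase64Url(byte[] data) =>
    Convert.ToBase64String(data).TrimEnd('=').Replace('+', '-').Replace('/', '_');

// Hypothetical helper: reverses ToBase64Url.
static byte[] FromBase64Url(string text)
{
    string s = text.Replace('-', '+').Replace('_', '/');

    // Restore the padding that ToBase64Url removed.
    switch (s.Length % 4)
    {
        case 2: s += "=="; break;
        case 3: s += "="; break;
    }
    return Convert.FromBase64String(s);
}

byte[] bytes = { 130, 200, 234, 23 };
string token = ToBase64Url(bytes);
byte[] roundTrip = FromBase64Url(token);

Console.WriteLine(token);                            // gsjqFw
Console.WriteLine(BitConverter.ToString(roundTrip)); // 82-C8-EA-17
```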
A full example:
// Needs: using System; using System.Text; using System.Web;
byte[] bytes = { 130, 200, 234, 23 }; // Arbitrary binary data: non-ASCII bytes that are not valid UTF-8
string s1 = Encoding.UTF8.GetString(bytes); // ���
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1); // decBytes1.Length == 10 !!
// decBytes1 is not the same as bytes: each invalid byte became U+FFFD, which encodes to 3 bytes
// Other Encoding objects give similar results
string s2 = BitConverter.ToString(bytes); // 82-C8-EA-17
string[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
    decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes
string s3 = Convert.ToBase64String(bytes); // gsjqFw==
byte[] decBytes3 = Convert.FromBase64String(s3);
// decBytes3 same as bytes
string s4 = HttpServerUtility.UrlTokenEncode(bytes); // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
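On .NET 5 and later, the manual Split loop used for the BitConverter output can be replaced by Convert.FromHexString, and Convert.ToHexString gives an undelimited hex string directly. This is a small sketch assuming a .NET 5+ target; these two methods are not available on .NET Framework.

```csharp
using System;

byte[] bytes = { 130, 200, 234, 23 };

// Convert.ToHexString / Convert.FromHexString exist from .NET 5 onward.
string hex = Convert.ToHexString(bytes);      // 82C8EA17
byte[] fromHex = Convert.FromHexString(hex);

// FromHexString does not accept dashes, so strip them from
// BitConverter's output before parsing.
string dashed = BitConverter.ToString(bytes); // 82-C8-EA-17
byte[] fromDashed = Convert.FromHexString(dashed.Replace("-", ""));

Console.WriteLine(hex);
Console.WriteLine(BitConverter.ToString(fromHex));
Console.WriteLine(BitConverter.ToString(fromDashed));
```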