Converting string to byte array in C#

I'm converting something from VB into C#. Having a problem with the syntax of this statement:

if ((searchResult.Properties["user"].Count > 0))
{
    profile.User = System.Text.Encoding.UTF8.GetString(searchResult.Properties["user"][0]);
}

I then see the following errors:

Argument 1: cannot convert from 'object' to 'byte[]'

The best overloaded method match for 'System.Text.Encoding.GetString(byte[])' has some invalid arguments

I tried to fix the code based on this post, but still without success:

string User = Encoding.UTF8.GetString("user", 0);

Any suggestions?


If you already have a byte array, you need to know which encoding was used to produce it.

For example, if the byte array was created like this:

byte[] bytes = Encoding.ASCII.GetBytes(someString);

You will need to turn it back into a string like this:

string someString = Encoding.ASCII.GetString(bytes);

If you can find the encoding that was used to create the byte array in the code you inherited, you should be set.
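
The compiler error in the question comes from passing an object where GetString expects a byte[]. Below is a minimal sketch of one way to handle that, assuming the property value is a byte[] or a string at run time. The names PropertyDecoder and DecodeUtf8 are hypothetical, and UTF-8 is only correct if the bytes were actually produced with UTF-8.

```csharp
using System.Text;

static class PropertyDecoder
{
    // Hypothetical helper: turns a loosely typed property value into a string.
    // Assumption: byte[] values hold UTF-8 encoded text.
    public static string DecodeUtf8(object value)
    {
        switch (value)
        {
            case byte[] bytes:
                return Encoding.UTF8.GetString(bytes); // decode raw bytes
            case string s:
                return s;                              // already a string
            case null:
                return null;                           // nothing to decode
            default:
                return value.ToString();               // fallback for other types
        }
    }
}
```

With such a helper, the assignment from the question could be written as profile.User = PropertyDecoder.DecodeUtf8(searchResult.Properties["user"][0]);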


First of all, add the System.Text namespace

using System.Text;

Then use this code

string input = "some text"; 
byte[] array = Encoding.ASCII.GetBytes(input);

Hope this fixes it!


You can also use an extension method to add a method to the string type, as below:

static class Helper
{
   public static byte[] ToByteArray(this string str)
   {
      return System.Text.Encoding.ASCII.GetBytes(str);
   }
}

And use it like below:

string foo = "bla bla";
byte[] result = foo.ToByteArray();
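
Since ASCII cannot represent characters outside the 7-bit range, a variant that lets the caller choose the encoding may be more flexible. This is only a sketch: the names StringEncodingExtensions and ToBytes are hypothetical, and falling back to UTF-8 when no encoding is passed is my assumption, not part of the answer above.

```csharp
using System.Text;

static class StringEncodingExtensions
{
    // Hypothetical variant of the extension method above.
    // Assumption: UTF-8 is used when the caller does not pass an encoding.
    public static byte[] ToBytes(this string str, Encoding encoding = null)
    {
        return (encoding ?? Encoding.UTF8).GetBytes(str);
    }
}
```

Usage would look like "bla bla".ToBytes() for UTF-8, or "bla bla".ToBytes(Encoding.Unicode) for UTF-16.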


var result = System.Text.Encoding.Unicode.GetBytes(text); // Encoding.Unicode is UTF-16 (little-endian)
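
Note that Encoding.Unicode means UTF-16, so the resulting array has a different length and layout than a UTF-8 one. A small sketch to illustrate the difference (the class and method names are hypothetical):

```csharp
using System.Text;

static class EncodingSizes
{
    // Returns how many bytes the given encoding needs for the text.
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetBytes(text).Length;
    }
}
```

For the five-character string "šarže", ByteCount returns 10 with Encoding.Unicode (two bytes per character) and 7 with Encoding.UTF8 (two bytes each for š and ž, one for each of the other letters). Whichever encoding you pick, the same one has to be used for decoding.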

Encoding.Default should not be used...

Some answers use Encoding.Default; however, Microsoft warns against it:

Different computers can use different encodings as the default, and the default encoding can change on a single computer. If you use the Default encoding to encode and decode data streamed between computers or retrieved at different times on the same computer, it may translate that data incorrectly. In addition, the encoding returned by the Default property uses best-fit fallback to map unsupported characters to characters supported by the code page. For these reasons, using the default encoding is not recommended. To ensure that encoded bytes are decoded properly, you should use a Unicode encoding, such as UTF8Encoding or UnicodeEncoding. You could also use a higher-level protocol to ensure that the same format is used for encoding and decoding.

To check what the default encoding is, use Encoding.Default.WindowsCodePage (1250 in my case). Sadly, there is no predefined property for the CP1250 encoding, but the object can be retrieved with Encoding.GetEncoding(1250).
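
A hedged sketch of retrieving a code page encoding by number. The assumption here is that the code runs on .NET Core / .NET 5+, where legacy code pages such as 1250 are only available after registering CodePagesEncodingProvider; on .NET Framework the code pages are built in, and the registration line may require the System.Text.Encoding.CodePages package or can be dropped. The names CodePageLookup and GetCodePage are hypothetical.

```csharp
using System.Text;

static class CodePageLookup
{
    // Returns the encoding for a Windows code page such as 1250.
    // Assumption: .NET Core / .NET 5+, where legacy code pages must be
    // registered before Encoding.GetEncoding can find them.
    public static Encoding GetCodePage(int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }
}
```

For example, CodePageLookup.GetCodePage(1250).GetBytes("š") yields the single byte 0x9A.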

...UTF-8 encoding should be used instead...

Encoding.ASCII, used in the highest-scoring answer, is 7-bit, so it doesn't work in my case either:

byte[] pass = Encoding.ASCII.GetBytes("šarže");
Console.WriteLine(Encoding.ASCII.GetString(pass)); // ?ar?e

Following Microsoft's recommendation:

var utf8 = new UTF8Encoding();
byte[] pass = utf8.GetBytes("šarže");
Console.WriteLine(utf8.GetString(pass)); // šarže

Encoding.UTF8, recommended by others, is an instance of UTF8Encoding and can also be used directly or as

var utf8 = Encoding.UTF8 as UTF8Encoding;
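
One difference between the two that may matter when writing files or streams: Encoding.UTF8 is configured to provide a byte order mark (BOM), while new UTF8Encoding() is not. GetBytes itself produces the same bytes either way. A small sketch to check this (the class and method names are hypothetical):

```csharp
using System.Text;

static class Utf8Preamble
{
    // Length of the byte order mark the encoding offers for the start of a stream.
    public static int PreambleLength(Encoding encoding)
    {
        return encoding.GetPreamble().Length;
    }
}
```

Utf8Preamble.PreambleLength(Encoding.UTF8) returns 3 (the bytes EF BB BF), and Utf8Preamble.PreambleLength(new UTF8Encoding()) returns 0.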

...but it is not always used

The notion of a default encoding is misleading: .NET prefers UTF-8 for text I/O (source files with hardcoded strings are typically saved as UTF-8 too, although strings are UTF-16 in memory), but Windows actually uses two other non-UTF-8, non-standard defaults: the ANSI code page (for GUI apps predating .NET) and the OEM code page (aka the DOS standard). These differ from country to country (for instance, the Czech edition of Windows uses CP1250 and CP852) and are often hardcoded in Windows API libraries. So if you just set the console to UTF-8 with chcp 65001 (as .NET implicitly does, pretending it is the default) and run some localized command (like ping), it works in the English version, but you get tofu (garbled) text on a Czech system.

Let me share my real-world experience: I created a WinForms application that customizes git scripts for teachers. The output is obtained asynchronously in the background by a process, described by Microsoft as follows (the remarks in square brackets were added by me):

The word "shell" in this context (UseShellExecute) refers to a graphical shell [ANSI CP] (similar to the Windows shell) rather than command shells (for example, bash or sh) [OEM CP] and lets users launch graphical applications or open documents [with messed-up output in a non-US environment].

So effectively, the GUI defaults to UTF-8, the process defaults to CP1250, and the console defaults to CP852. The output is therefore CP852 interpreted as UTF-8 interpreted as CP1250. I got tofu text from which I could not deduce the original code page because of the double conversion. I was pulling my hair out for a week before I figured out that I had to explicitly set UTF-8 for the process script and convert the output from CP1250 to UTF-8 in the main thread. Now it works here in Eastern Europe, but Western European Windows uses CP1252. The ANSI code page is not easy to determine, as many commands like systeminfo are also localized, and other methods differ from version to version: in such an environment, displaying national characters reliably is almost infeasible.
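
This is not my actual application code, just a minimal sketch of the two mechanisms involved: telling a child process's redirected output which encoding to decode with, and re-encoding raw bytes from one code page to another. The names ProcessEncoding, CreateStartInfo and Recode are hypothetical. On .NET Core / .NET 5+, Encoding.GetEncoding(1250) is assumed to need Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.

```csharp
using System.Diagnostics;
using System.Text;

static class ProcessEncoding
{
    // Builds start info whose redirected standard output is decoded with an
    // explicit encoding instead of whatever the system default happens to be.
    public static ProcessStartInfo CreateStartInfo(string fileName, string arguments, Encoding outputEncoding)
    {
        return new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,        // required for redirecting output
            RedirectStandardOutput = true,
            StandardOutputEncoding = outputEncoding,
            CreateNoWindow = true
        };
    }

    // Re-encodes raw bytes from one encoding to another (for example CP1250 to UTF-8).
    public static byte[] Recode(byte[] bytes, Encoding from, Encoding to)
    {
        return Encoding.Convert(from, to, bytes);
    }
}
```

For instance, recoding the CP1250 byte 0x9A (š) to UTF-8 gives the two bytes C5 A1. Which encoding the child process really emits still has to be determined for each environment; the sketch only makes the choice explicit.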

So until the middle of the 21st century, please DO NOT rely on any "default code page"; set the encoding explicitly (to UTF-8 if possible).