What are the implications of asking Reflection APIs to overwrite System.String.Empty?

I stumbled upon this code:

static void Main()
{
    typeof(string).GetField("Empty").SetValue(null, "evil");//from DailyWTF

    Console.WriteLine(String.Empty);//check

    //how does it behave?
    if ("evil" == String.Empty) Console.WriteLine("equal"); 

    //output: 
    //evil 
    //equal

}

and I wonder how it is even possible to compile this piece of code. My reasoning is:

According to MSDN, String.Empty is read-only, so changing it should be impossible and compilation should fail with "A static readonly field cannot be assigned to" or a similar error.

I thought Base Class Library assemblies are somehow protected and signed and whatnot to prevent exactly this kind of attack. Next time someone may change System.Security.Cryptography or another critical class.

I thought Base Class Library assemblies are compiled by NGEN after .NET installation, so changing fields of the String class should require advanced hacking and be much harder.

And yet this code compiles and works. Can somebody please explain what is wrong with my reasoning?


Solution 1:

A static readonly field cannot be assigned to

You're not assigning to it. You're calling public functions in the System.Reflection namespace. No reason for the compiler to complain about that.

Besides, typeof(string).GetField("Empty") could just as well take its argument from a variable entered by the user. There's no sure way for the compiler to tell in all cases whether the argument to GetField will end up being "Empty".

I think you're wanting Reflection to see that the field is marked initonly and throw an error at runtime. I can see why you would expect that, yet for white-box testing, even writing to initonly fields has some application.
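The check described above is easy to do yourself, because reflection exposes the initonly flag as FieldInfo.IsInitOnly. Here is a minimal sketch of a guard that refuses such writes. The names Sample, InitOnlyGuard and TrySetStaticField are hypothetical, made up for illustration; only the FieldInfo members are real API.

```csharp
using System;
using System.Reflection;

public static class Sample
{
    public static string Writable = "a";        // ordinary static field
    public static readonly string Fixed = "b";  // compiled to an initonly field
}

public static class InitOnlyGuard
{
    // Hypothetical helper: sets a static field by name, but refuses
    // initonly (readonly) and literal (const) fields.
    public static bool TrySetStaticField(Type type, string name, object value)
    {
        FieldInfo field = type.GetField(
            name, BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Static);

        if (field == null || field.IsInitOnly || field.IsLiteral)
            return false;

        field.SetValue(null, value);
        return true;
    }
}
```

With this guard, TrySetStaticField(typeof(string), "Empty", "evil") returns false and leaves String.Empty alone, while the write to Sample.Writable goes through.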

The reason NGEN has no effect is that you're not modifying any code here, only data. Data is stored in memory with .NET just as with any other language. Native programs may use read-only memory sections for things like string constants, but the pointer to the string is generally still writable, and that is what is being changed here.

Note that your code must be running with full trust to use reflection in this questionable way. Also, the change only affects one process, so this isn't any sort of security vulnerability as you seem to think (if you're running malicious code inside your process with full trust, that design decision is the security problem, not reflection).


Further note that the values of initonly fields inside mscorlib.dll are global invariants of the .NET runtime. After breaking them, you can't even reliably test whether the invariant was broken, because the code to inspect the current value of System.String.Empty has also broken, because you've violated its invariants. Start violating system invariants and nothing can be relied on.

Specifying these values in the .NET specifications enables the compiler to implement a whole range of performance optimizations. A simple one is that

s == System.String.Empty

and

(s != null) && (s.Length == 0)

are equivalent, but the latter is much faster (relatively speaking).
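To make the equivalence concrete, here is a small sketch that evaluates both forms side by side. It assumes String.Empty has not been tampered with; the class and method names (EmptyChecks, ByEquality, ByLength, Agree) are hypothetical.

```csharp
using System;

public static class EmptyChecks
{
    // The comparison as written in source code.
    public static bool ByEquality(string s)
    {
        return s == String.Empty;
    }

    // The null-and-length form it can be reduced to.
    public static bool ByLength(string s)
    {
        return (s != null) && (s.Length == 0);
    }

    // True when both forms give the same answer for s.
    public static bool Agree(string s)
    {
        return ByEquality(s) == ByLength(s);
    }
}
```

Agree holds for null, "", and any non-empty string only as long as the invariant does. Once String.Empty refers to "evil", ByEquality("evil") may become true while ByLength("evil") stays false, which is exactly the kind of inconsistency described above.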

Also the compiler can determine that

if (int.Parse(s) > int.MaxValue)

is never true, and generate an unconditional jump to the else block (it still has to call Int32.Parse to have the same exception behavior, but the comparison can be removed).

System.String.Empty is also used extensively inside BCL implementations. If you overwrite it, all sorts of crazy things can happen, including damage that leaks outside your program (for example, you might write to a file whose name is built using string manipulation; when string breaks, you might overwrite the wrong file).


And the behavior might easily differ between .NET versions. Normally when new optimization opportunities are found, they don't get backported to previous versions of the JIT compiler (and even if they were, there could be installations from before the backport was implemented). In particular, String.Empty-related optimizations are observably different among .NET 2.x, Mono, and .NET 4.5+.

Solution 2:

The code compiles because every line of the code is perfectly legal C#. What specific line do you think should be a syntax error? There is no line of code there that assigns to a readonly field. There's a line of code that calls a method in Reflection that assigns to a readonly field, but that's already compiled, and ultimately the thing that breaks security in there wasn't even written in C#, it was written in C++. It's part of the runtime engine itself.

The code runs successfully because full trust means full trust. You are running your code in a full trust environment, and since full trust means full trust, the runtime is assuming that you know what you're doing when you do this stupid dangerous thing.

If you try running your code in a partially trusted environment then you'll see that Reflection throws a "you're not allowed to do that" exception.
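If you want to see which way your environment decides, a probe along these lines reports the outcome instead of crashing. This is only a sketch. Holder is a hypothetical stand-in for String.Empty so that nothing in the BCL gets damaged, and the two exception types caught are my assumption about what a refusing runtime throws. As far as I know, newer runtimes (.NET Core 3.0 and later) also refuse to set a static initonly field through FieldInfo.SetValue, so "denied" does not necessarily mean you are in a sandbox.

```csharp
using System;
using System.Reflection;
using System.Security;

public static class Holder
{
    // Hypothetical stand-in for String.Empty, so nothing in the BCL is touched.
    public static readonly string Value = "original";
}

public static class OverwriteProbe
{
    // Attempts the reflective write and reports what the runtime decided.
    public static string TryOverwrite()
    {
        FieldInfo field = typeof(Holder).GetField("Value");
        try
        {
            field.SetValue(null, "changed");
            return "allowed";
        }
        catch (FieldAccessException ex)
        {
            // Assumed refusal path: the runtime rejected access to the field.
            return "denied: " + ex.GetType().Name;
        }
        catch (SecurityException ex)
        {
            // Assumed refusal path: the caller lacks the required permission.
            return "denied: " + ex.GetType().Name;
        }
    }
}
```

Which string TryOverwrite returns depends on the runtime and trust level, so treat the result as a diagnostic, not as something to build logic on.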

And yes, the assemblies are signed and whatnot. If you're running fully trusted code, then sure, it can screw around with those assemblies as much as it wants. That's what full trust means. Partially trusted code doesn't get to do that, but fully trusted code can do anything you can do. Only grant full trust to code you actually trust to not do crazy things on your behalf.

Solution 3:

Reflection allows you to do practically anything. You can even set the value of private members.

Reflection does not follow the usual accessibility rules; you can read about it on MSDN.

Another example: Can I change a private readonly field in C# using reflection?
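A minimal sketch of that private readonly case, using a made-up Account class (all names here are hypothetical). It targets an instance field, and it assumes the code runs with enough trust for non-public reflection.

```csharp
using System;
using System.Reflection;

public sealed class Account
{
    private readonly int balance;

    public Account(int balance)
    {
        this.balance = balance;
    }

    public int Balance
    {
        get { return balance; }
    }
}

public static class PrivateReadonlyDemo
{
    // Overwrites the private readonly instance field "balance" via reflection
    // and returns what the public property reports afterwards.
    public static int Overwrite(Account account, int newBalance)
    {
        FieldInfo field = typeof(Account).GetField(
            "balance", BindingFlags.Instance | BindingFlags.NonPublic);
        field.SetValue(account, newBalance);
        return account.Balance;
    }
}
```

Calling PrivateReadonlyDemo.Overwrite(new Account(10), 42) goes through without a compiler error, for the same reason as in the question: the assignment happens inside Reflection, not in your source code.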


If you are on a web application you can set the trust level of your application.

<trust level="[Full|High|Medium|Low|Minimal]" />

These are the available trust levels. According to MSDN, medium trust restricts reflection access.

Edit: DO NOT run a web application at anything other than Full Trust; this is a direct recommendation from the ASP.NET team. To protect your application, create one App Pool for each website.


Also, it is not recommended to go wild using reflection for everything. It has the right place and time to be used.

Solution 4:

A point nobody has mentioned here before: this piece of code behaves differently on different .NET implementations and platforms. On Mono it actually prints nothing at all: see IDE One (Mono 2.8); my local Mono 2.6.7 (Linux) produces the same "output".

I haven't yet looked at the low-level code, but I suppose the behavior is specific either to the compiler, as mentioned by PRASHANT P, or to the runtime environment.

UPDATE

Running the Mono compiled exe on Windows (MS Dotnet 4) produces

evil 
equal

Running the Windows-compiled exe on Linux was not possible (Dotnet 4...), so I recompiled with Dotnet 2 (it still says evil and equal on Windows). On Linux there is "no" output. Of course there must be at least the "\n" from the first WriteLine, and in fact it is there: I piped the output to a file and opened it in my hex editor, which showed a single character, 0x0A.

To cut a long story short: It seems to be runtime environment specific.