How is the boxing/unboxing behavior of Nullable<T> possible?

Something just occurred to me earlier today that has got me scratching my head.

Any variable of type Nullable<T> can be assigned null. For instance:

int? i = null;

At first I couldn't see how this would be possible without somehow defining an implicit conversion from object to Nullable<T>:

public static implicit operator Nullable<T>(object box);

But the above operator clearly does not exist, as if it did then the following would also have to be legal, at least at compile-time (which it isn't):

int? i = new object();

Then I realized that perhaps the Nullable<T> type could define an implicit conversion to some arbitrary reference type that can never be instantiated, like this:

public abstract class DummyBox
{
    private DummyBox()
    { }
}

public struct Nullable<T> where T : struct
{
    public static implicit operator Nullable<T>(DummyBox box)
    {
        if (box == null)
        {
            return new Nullable<T>();
        }

        // This should never be possible, as a DummyBox cannot be instantiated.
        throw new InvalidCastException();
    }
}

However, this does not explain what occurred to me next: if the HasValue property is false for any Nullable<T> value, then that value will be boxed as null:

int? i = new int?();
object x = i; // Now x is null.

Furthermore, if HasValue is true, then the value will be boxed as a T rather than a T?:

int? i = 5;
object x = i; // Now x is a boxed int, NOT a boxed Nullable<int>.

But this seems to imply that there is a custom implicit conversion from Nullable<T> to object:

public static implicit operator object(Nullable<T> value);

This is clearly not the case as object is a base class for all types, and user-defined implicit conversions to/from base types are illegal (as well they should be).

It seems that object x = i; should box i like any other value type, so that x.GetType() would yield the same result as typeof(int?) (rather than throw a NullReferenceException).
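The observed behavior can be checked with a small sketch like the one below. The class and method names (NullableBoxingDemo, DescribeBox) are made up for illustration; only Nullable<T>, object, and GetType() come from the framework.

```csharp
using System;

public static class NullableBoxingDemo
{
    // Boxes a nullable int and reports the runtime type of the box,
    // or "null" when boxing produced a null reference.
    public static string DescribeBox(int? value)
    {
        object boxed = value; // boxing conversion from Nullable<int> to object
        return boxed == null ? "null" : boxed.GetType().FullName;
    }

    public static void Main()
    {
        Console.WriteLine(DescribeBox(5));    // prints "System.Int32"
        Console.WriteLine(DescribeBox(null)); // prints "null"
    }
}
```

The null check before GetType() is needed because calling GetType() on the boxed empty value would throw a NullReferenceException.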

So I dug around a bit and, sure enough, it turns out this behavior is specific to the Nullable<T> type, specially defined in both the C# and VB.NET specifications, and not reproducible in any user-defined struct (C#) or Structure (VB.NET).

Here's why I'm still confused.

This particular boxing and unboxing behavior appears to be impossible to implement by hand. It only works because both C# and VB.NET give special treatment to the Nullable<T> type.
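To make the contrast concrete, here is a rough sketch of a hand-written look-alike. Maybe<T> is a hypothetical type invented for this example, not something from the framework. Boxing it always produces a real object of type Maybe<T>, whereas the built-in type boxes its empty value to a null reference.

```csharp
using System;

// Hypothetical look-alike of Nullable<T>; not part of any library.
public struct Maybe<T> where T : struct
{
    public readonly bool HasValue;
    public readonly T Value;

    public Maybe(T value)
    {
        HasValue = true;
        Value = value;
    }
}

public static class CustomStructBoxingDemo
{
    // An empty Maybe<int> is still boxed as an object, so this returns false.
    public static bool EmptyMaybeBoxesToNull()
    {
        object boxed = new Maybe<int>();
        return boxed == null;
    }

    // An empty int? is boxed as a null reference, so this returns true.
    public static bool EmptyNullableBoxesToNull()
    {
        object boxed = new int?();
        return boxed == null;
    }

    // The box of a user-defined struct keeps the struct's own type.
    public static Type BoxedTypeOfMaybe()
    {
        object boxed = new Maybe<int>(5);
        return boxed.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(EmptyMaybeBoxesToNull());                     // False
        Console.WriteLine(EmptyNullableBoxesToNull());                  // True
        Console.WriteLine(BoxedTypeOfMaybe() == typeof(Maybe<int>));    // True
    }
}
```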

1) Isn't it theoretically possible that a different CLI-based language could exist where Nullable<T> weren't given this special treatment? And wouldn't the Nullable<T> type therefore exhibit different behavior in different languages?
2) How do C# and VB.NET achieve this behavior? Is it supported by the CLR? (That is, does the CLR allow a type to somehow "override" the manner in which it is boxed, even though C# and VB.NET themselves prohibit it?)
3) Is it even possible (in C# or VB.NET) to box a Nullable<T> as object?

There are two things going on:

1) The compiler treats "null" not as a null reference but as a null value... the null value for whatever type it needs to convert to. In the case of a Nullable<T> it's just the value which has False for the HasValue field/property. So if you have a variable of type int?, it's quite possible for the value of that variable to be null - you just need to change your understanding of what null means a little bit.
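A minimal sketch of the first point, with illustrative names (NullLiteralDemo and its methods are not from any library): a "null" nullable is an ordinary struct value, so its members can be called without a NullReferenceException.

```csharp
using System;

public static class NullLiteralDemo
{
    // The null literal converts to the struct value whose HasValue is false.
    public static bool NullLiteralHasValue()
    {
        int? i = null;
        return i.HasValue; // safe: i is a struct value, not a null reference
    }

    // The null literal and the parameterless constructor give the same value.
    public static bool NullLiteralEqualsDefault()
    {
        int? fromLiteral = null;
        int? fromCtor = new int?();
        return fromLiteral.Equals(fromCtor) && fromLiteral == null;
    }

    // GetValueOrDefault() on the null value returns default(int).
    public static int ValueOrDefaultOfNull()
    {
        int? i = null;
        return i.GetValueOrDefault();
    }

    public static void Main()
    {
        Console.WriteLine(NullLiteralHasValue());      // False
        Console.WriteLine(NullLiteralEqualsDefault()); // True
        Console.WriteLine(ValueOrDefaultOfNull());     // 0
    }
}
```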

2) Boxing nullable types gets special treatment by the CLR itself. This is relevant in your second example:

    int? i = new int?();
    object x = i;

The CLR boxes a nullable type value differently from a non-nullable type value. If the value isn't null, the result is the same as boxing the underlying value as a non-nullable type value, so an int? with value 5 gets boxed in the same way as an int with value 5, and the "nullability" is lost. However, the null value of a nullable type is boxed to just the null reference, rather than creating an object at all.

This was introduced late in the CLR v2 cycle, at the request of the community.

It means there's no such thing as a "boxed nullable-value-type value".
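The unboxing side can be sketched in the same way. The names below (NullableUnboxingDemo and its methods) are illustrative only. A boxed int can be unboxed either as int or as int?, and a null reference can be unboxed as int? but not as int.

```csharp
using System;

public static class NullableUnboxingDemo
{
    // Box a nullable int, then unbox it back to int?.
    // A null reference unboxes to the int? value with HasValue == false.
    public static int? RoundTrip(int? value)
    {
        object boxed = value;
        return (int?)boxed;
    }

    // Box a nullable int, then unbox it as a plain int.
    // This throws NullReferenceException when value is null,
    // because the box is then a null reference.
    public static int UnboxAsPlainInt(int? value)
    {
        object boxed = value;
        return (int)boxed;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(5));             // 5
        Console.WriteLine(RoundTrip(null).HasValue); // False
        Console.WriteLine(UnboxAsPlainInt(5));       // 5
    }
}
```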

You got it right: Nullable<T> gets special treatment from the compiler, both in VB and C#. Therefore:

1) Yes. The language compiler needs to special-case Nullable<T>.
2) The compiler refactors usage of Nullable<T>. The operators are just syntactic sugar.
3) Not that I know of.

I was asking myself the same question, and I also expected to find an implicit operator in the .NET source code for Nullable<T>. So I looked at the IL generated for int? a = null; to understand what happens behind the scenes:

C# code:

int? a = null;
int? a2 = new int?();
object a3 = null;
int? b = 5;
int? b2 = new int?(5);

IL code (generated with LINQPad 5):

IL_0000:  nop         
IL_0001:  ldloca.s    00 // a
IL_0003:  initobj     System.Nullable<System.Int32>
IL_0009:  ldloca.s    01 // a2
IL_000B:  initobj     System.Nullable<System.Int32>
IL_0011:  ldnull      
IL_0012:  stloc.2     // a3
IL_0013:  ldloca.s    03 // b
IL_0015:  ldc.i4.5    
IL_0016:  call        System.Nullable<System.Int32>..ctor
IL_001B:  ldloca.s    04 // b2
IL_001D:  ldc.i4.5    
IL_001E:  call        System.Nullable<System.Int32>..ctor
IL_0023:  ret

We see that the compiler changes int? a = null into something like int? a = new int?(), which is quite different from object a3 = null. So nullables clearly get special treatment from the compiler.
