How is the boxing/unboxing behavior of Nullable<T> possible?
Something just occurred to me earlier today that has got me scratching my head.
Any variable of type Nullable<T> can be assigned null. For instance:
int? i = null;
At first I couldn't see how this would be possible without somehow defining an implicit conversion from object to Nullable<T>:
public static implicit operator Nullable<T>(object box);
But the above operator clearly does not exist, as if it did then the following would also have to be legal, at least at compile-time (which it isn't):
int? i = new object();
Then I realized that perhaps the Nullable<T> type could define an implicit conversion from some arbitrary reference type that can never be instantiated, like this:
public abstract class DummyBox
{
    private DummyBox()
    { }
}

public struct Nullable<T> where T : struct
{
    public static implicit operator Nullable<T>(DummyBox box)
    {
        if (box == null)
        {
            return new Nullable<T>();
        }

        // This should never be possible, as a DummyBox cannot be instantiated.
        throw new InvalidCastException();
    }
}
However, this does not explain what occurred to me next: if the HasValue property is false for any Nullable<T> value, then that value will be boxed as null:
int? i = new int?();
object x = i; // Now x is null.
Furthermore, if HasValue is true, then the value will be boxed as a T rather than a T?:
int? i = 5;
object x = i; // Now x is a boxed int, NOT a boxed Nullable<int>.
But this seems to imply that there is a custom implicit conversion from Nullable<T> to object:
public static implicit operator object(Nullable<T> value);
This is clearly not the case, as object is a base class for all types, and user-defined implicit conversions to/from base types are illegal (as well they should be).
It seems that object x = i; should box i like any other value type, so that x.GetType() would yield the same result as typeof(int?) (rather than throw a NullReferenceException).
So I dug around a bit and, sure enough, it turns out this behavior is specific to the Nullable<T> type, specially defined in both the C# and VB.NET specifications, and not reproducible in any user-defined struct (C#) or Structure (VB.NET).
Here's why I'm still confused.
This particular boxing and unboxing behavior appears to be impossible to implement by hand. It only works because both C# and VB.NET give special treatment to the Nullable<T> type.
1) Isn't it theoretically possible that a different CLI-based language could exist where Nullable<T> weren't given this special treatment? And wouldn't the Nullable<T> type therefore exhibit different behavior in different languages?
2) How do C# and VB.NET achieve this behavior? Is it supported by the CLR? (That is, does the CLR allow a type to somehow "override" the manner in which it is boxed, even though C# and VB.NET themselves prohibit it?)
3) Is it even possible (in C# or VB.NET) to box a Nullable<T> as object?
There are two things going on:
1) The compiler treats "null" not as a null reference but as a null value... the null value for whatever type it needs to convert to. In the case of a Nullable<T> it's just the value which has False for the HasValue field/property. So if you have a variable of type int?, it's quite possible for the value of that variable to be null - you just need to change your understanding of what null means a little bit.
2) Boxing nullable types gets special treatment by the CLR itself. This is relevant in your second example:
int? i = new int?();
object x = i;
The CLR boxes a value of a nullable type differently from a value of a non-nullable type. If the value isn't null, the result is the same as boxing the underlying non-nullable value, so an int? with value 5 gets boxed in the same way as an int with value 5, and the "nullability" is lost. However, the null value of a nullable type is boxed to just the null reference, rather than creating an object at all.
This was introduced late in the CLR v2 cycle, at the request of the community.
It means there's no such thing as a "boxed nullable-value-type value".
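The boxing and unboxing rules described above can be checked with a small console program. This is only a sketch; the class and variable names are made up for illustration.

```csharp
using System;

public static class NullableBoxingDemo
{
    public static void Main()
    {
        int? none = null;
        int? five = 5;

        object boxedNone = none; // no object is allocated; the reference is null
        object boxedFive = five; // boxed as System.Int32, not Nullable<Int32>

        Console.WriteLine(boxedNone == null);   // True
        Console.WriteLine(boxedFive.GetType()); // System.Int32
        Console.WriteLine(boxedFive is int);    // True

        // Unboxing to int? accepts both a boxed int and the null reference.
        int? fromBoxedInt = (int?)boxedFive;
        int? fromNull = (int?)boxedNone;

        Console.WriteLine(fromBoxedInt.HasValue); // True
        Console.WriteLine(fromNull.HasValue);     // False
    }
}
```

Because a null int? boxes to the null reference, calling boxedNone.GetType() would throw a NullReferenceException, which is the behavior the question observed.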
You got it right: Nullable<T> gets special treatment from the compiler, both in VB and C#. Therefore:
- Yes. The language compiler needs to special-case Nullable<T>.
- The compiler refactors usage of Nullable<T>. The operators are just syntactic sugar.
- Not that I know of.
I was asking myself the same question, and I also expected to find some implicit operator for Nullable<T> in the .NET Nullable source code, so I looked at the IL code generated for int? a = null; to understand what is happening behind the scenes:
C# code:
int? a = null;
int? a2 = new int?();
object a3 = null;
int? b = 5;
int? b2 = new int?(5);
IL code (generated with LINQPad 5):
IL_0000: nop
IL_0001: ldloca.s 00 // a
IL_0003: initobj System.Nullable<System.Int32>
IL_0009: ldloca.s 01 // a2
IL_000B: initobj System.Nullable<System.Int32>
IL_0011: ldnull
IL_0012: stloc.2 // a3
IL_0013: ldloca.s 03 // b
IL_0015: ldc.i4.5
IL_0016: call System.Nullable<System.Int32>..ctor
IL_001B: ldloca.s 04 // b2
IL_001D: ldc.i4.5
IL_001E: call System.Nullable<System.Int32>..ctor
IL_0023: ret
We see that the compiler changes int? a = null to something like int? a = new int?(), which is quite different from object a3 = null. So nullables clearly get special treatment from the compiler.
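The same conclusion can be observed at runtime without reading IL. In this sketch (the class name is hypothetical), the null literal, new int?(), and default(int?) all produce the same value with HasValue set to false.

```csharp
using System;

public static class NullLiteralDemo
{
    public static void Main()
    {
        int? a = null;           // compiled to initobj, as in the IL above
        int? a2 = new int?();    // the same default-initialized struct
        int? a3 = default(int?); // likewise

        Console.WriteLine(a.HasValue);            // False
        Console.WriteLine(a.Equals(a2));          // True
        Console.WriteLine(a == a3);               // True
        Console.WriteLine(a.GetValueOrDefault()); // 0
    }
}
```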