Why does an implicit conversion operator from <T> to <U> accept <T?>?

This is a weird behaviour that I cannot make sense of. In my example I have a class Sample<T> and an implicit conversion operator from T to Sample<T>.

private class Sample<T>
{
   public readonly T Value;

   public Sample(T value)
   {
      Value = value;
   }

   public static implicit operator Sample<T>(T value) => new Sample<T>(value);
}

The problem occurs when the value being converted is a nullable value type such as int?, while T is the underlying type int.

{
   int? a = 3;
   Sample<int> sampleA = a;
}

Here is the key part:
In my opinion this should not compile because Sample<int> defines a conversion from int to Sample<int> but not from int? to Sample<int>. But it compiles and runs successfully! (By which I mean the conversion operator is invoked and 3 will be assigned to the readonly field.)

And it gets even worse. Here the conversion operator isn't invoked and sampleB will be set to null:

{
   int? b = null;
   Sample<int> sampleB = b;
}

A great answer would probably be split into two parts:

Why does the code in the first snippet compile?
Can I prevent the code from compiling in this scenario?

Solution 1:

You can take a look at how the compiler lowers this code:

int? a = 3;
Sample<int> sampleA = a;

into this:

int? nullable = 3;
int? nullable2 = nullable;
Sample<int> sample = nullable2.HasValue ? ((Sample<int>)nullable2.GetValueOrDefault()) : null;

Because Sample<int> is a class, a variable of that type can hold null. Combined with the implicit operator from int, this lets the compiler convert an int? as well: a non-null value is unwrapped and passed to the operator, and a null value becomes a null reference. So assignments like these are valid:

int? a = 3;
int? b = null;
Sample<int> sampleA = a; 
Sample<int> sampleB = b;

If Sample<int> were a struct, this would of course give an error.
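To make the lowering concrete, here is a minimal sketch that compares the implicit conversion with a hand-written version of the lowered expression. The helper ConvertByHand is a hypothetical name introduced for this illustration, and Sample<T> is declared public so that the file compiles on its own.

```csharp
using System;

public sealed class Sample<T>
{
    public readonly T Value;

    public Sample(T value) => Value = value;

    public static implicit operator Sample<T>(T value) => new Sample<T>(value);
}

public static class Program
{
    // Hand-written equivalent of the lowered form shown above (hypothetical helper).
    public static Sample<int> ConvertByHand(int? input) =>
        input.HasValue ? (Sample<int>)input.GetValueOrDefault() : null;

    public static void Main()
    {
        int? a = 3;
        int? b = null;

        Sample<int> implicitA = a;   // lifted form: unwraps a, then calls the operator
        Sample<int> implicitB = b;   // lifted form: no operator call, result is null

        Console.WriteLine(implicitA.Value == ConvertByHand(a).Value);        // True
        Console.WriteLine(implicitB == null && ConvertByHand(b) == null);    // True
    }
}
```

Both lines print True, which matches the lowered code: the operator only ever sees a plain int.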

EDIT: So why is this possible? I couldn't find it in the spec because it is a deliberate spec violation, kept only for backwards compatibility. You can read about it in the Roslyn source code:

DELIBERATE SPEC VIOLATION:
The native compiler allows for a "lifted" conversion even when the return type of the conversion not a non-nullable value type. For example, if we have a conversion from struct S to string, then a "lifted" conversion from S? to string is considered by the native compiler to exist, with the semantics of "s.HasValue ? (string)s.Value : (string)null". The Roslyn compiler perpetuates this error for the sake of backwards compatibility.

That's how this "error" is implemented in Roslyn:

Otherwise, if the return type of the conversion is a nullable value type, reference type or pointer type P, then we lower this as:
temp = operand
temp.HasValue ? op_Whatever(temp.GetValueOrDefault()) : default(P)

So, according to the spec, for a given user-defined conversion operator T -> U there exists a lifted operator T? -> U? only when T and U are both non-nullable value types. However, the same logic is also implemented for a conversion operator where U is a reference type, for the reason quoted above.
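For comparison, here is a small sketch of the lifted form that the spec does describe, where both the parameter type and the return type are non-nullable value types. The struct Celsius is a hypothetical type made up for this example.

```csharp
using System;

// Hypothetical value type used only for this illustration.
public struct Celsius
{
    public readonly int Degrees;

    public Celsius(int degrees) { Degrees = degrees; }

    // User-defined conversion between two non-nullable value types: int -> Celsius.
    public static implicit operator Celsius(int degrees) => new Celsius(degrees);
}

public static class LiftingDemo
{
    public static void Main()
    {
        int? known = 21;
        int? unknown = null;

        // Lifted form described by the spec: int? -> Celsius?
        Celsius? c1 = known;
        Celsius? c2 = unknown;

        Console.WriteLine(c1.HasValue);        // True
        Console.WriteLine(c1.Value.Degrees);   // 21
        Console.WriteLine(c2.HasValue);        // False
    }
}
```

Here a null int? becomes a null Celsius? without the operator being called. The Sample<int> case behaves the same way, except that the target is a reference type, so the "null" result is a null reference rather than an empty Nullable<T>.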

PART 2: How can you prevent the code from compiling in this scenario? There is a way. You can define an additional implicit operator specifically for the nullable type and decorate it with the Obsolete attribute, passing error: true. This requires the type parameter T to be constrained to struct:

public class Sample<T> where T : struct
{
    ...

    [Obsolete("Some error message", error: true)]
    public static implicit operator Sample<T>(T? value) => throw new NotImplementedException();
}

For a nullable argument, this operator is chosen over the lifted one because it is more specific.

If you can't add such a constraint, you must define a separate operator for each value type (if you are really determined, you can generate them using reflection and code templates):

[Obsolete("Some error message", error: true)]
public static implicit operator Sample<T>(int? value) => throw new NotImplementedException();

Any code that uses this operator then fails to compile:

Error CS0619 'Sample<T>.implicit operator Sample<T>(int?)' is obsolete: 'Some error message'
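Putting the constrained variant together, a complete sketch might look like the following. The error message text and the class name GuardDemo are placeholders chosen for this example. The offending assignment is left commented out so that the file still compiles.

```csharp
using System;

public sealed class Sample<T> where T : struct
{
    public readonly T Value;

    public Sample(T value) => Value = value;

    public static implicit operator Sample<T>(T value) => new Sample<T>(value);

    // Exists only to turn "T? -> Sample<T>" into a compile-time error.
    [Obsolete("Convert from a non-nullable value instead", error: true)]
    public static implicit operator Sample<T>(T? value) => throw new NotImplementedException();
}

public static class GuardDemo
{
    public static void Main()
    {
        int? a = 3;

        // Sample<int> blocked = a;      // uncommenting this line should fail with CS0619
        Sample<int> ok = a.Value;        // unwrap explicitly; a.Value throws if a is null
        Sample<int> direct = 5;          // plain int still uses the regular operator

        Console.WriteLine(ok.Value);     // 3
        Console.WriteLine(direct.Value); // 5
    }
}
```

Callers who really do hold an int? are then forced to decide what null should mean before converting, for example with a.Value or a ?? 0.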

Solution 2:

I think this is a lifted conversion operator in action. The specification says:

Given a user-defined conversion operator that converts from a non-nullable value type S to a non-nullable value type T, a lifted conversion operator exists that converts from S? to T?. This lifted conversion operator performs an unwrapping from S? to S followed by the user-defined conversion from S to T followed by a wrapping from T to T?, except that a null valued S? converts directly to a null valued T?.

It looks like this is not applicable here, because while type S is a value type (int), type T is not a value type (it is the Sample class). However, an issue in the Roslyn repository states that this is actually a bug in the specification, and the Roslyn code documentation confirms it:

As mentioned above, here we diverge from the specification, in two ways. First, we only check for the lifted form if the normal form was inapplicable. Second, we are supposed to apply lifting semantics only if the conversion parameter and return types are both non-nullable value types.

In fact the native compiler determines whether to check for a lifted form on the basis of:

Is the type we are ultimately converting from a nullable value type?

Is the parameter type of the conversion a non-nullable value type?

Is the type we are ultimately converting to a nullable value type, pointer type, or reference type?

If the answer to all those questions is "yes" then we lift to nullable and see if the resulting operator is applicable.

If the compiler followed the specification, it would produce an error in this case, as you expect (and in some older versions it did), but now it does not.
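The three questions quoted above can be modelled as a small predicate over System.Type. This is only an illustrative sketch of the rule as described in the quote, not Roslyn's actual implementation; the names LiftCheck and ConsidersLiftedForm are invented for this example.

```csharp
using System;

public static class LiftCheck
{
    private static bool IsNullableValueType(Type t) => Nullable.GetUnderlyingType(t) != null;

    // Illustrative model of the three questions; not the compiler's real code.
    public static bool ConsidersLiftedForm(Type from, Type parameter, Type to)
    {
        // 1. Is the type we are converting from a nullable value type?
        bool fromIsNullable = IsNullableValueType(from);
        // 2. Is the operator's parameter type a non-nullable value type?
        bool parameterIsPlainValueType = parameter.IsValueType && !IsNullableValueType(parameter);
        // 3. Is the target a nullable value type, pointer type, or reference type?
        bool toAllowsNull = IsNullableValueType(to) || to.IsPointer || !to.IsValueType;

        return fromIsNullable && parameterIsPlainValueType && toAllowsNull;
    }

    public static void Main()
    {
        Console.WriteLine(ConsidersLiftedForm(typeof(int?), typeof(int), typeof(string))); // True
        Console.WriteLine(ConsidersLiftedForm(typeof(int?), typeof(int), typeof(long?)));  // True
        Console.WriteLine(ConsidersLiftedForm(typeof(int?), typeof(int), typeof(long)));   // False
        Console.WriteLine(ConsidersLiftedForm(typeof(int), typeof(int), typeof(string)));  // False
    }
}
```

The first call corresponds to the question's case: the source is int?, the operator takes int, and the target is a reference type, so the lifted form is considered.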

So, to summarize: I think the compiler uses the lifted form of your implicit operator. That should be impossible according to the specification, but the compiler diverges from the specification here because:

It is considered a bug in the specification, not in the compiler.
The specification was already violated by the old, pre-Roslyn compiler, and it is good to maintain backwards compatibility.

The first quote, which describes how a lifted operator works, matches exactly what happens in your case once you allow T to be a reference type. A null-valued S? (int?) is assigned directly to T (Sample<int>) without calling the conversion operator, and a non-null value is unwrapped to int and run through your operator (wrapping to T? is obviously not needed if T is a reference type).

Solution 3:

Why does the code in the first snippet compile?

Here is an excerpt from the source code of Nullable<T>:

[System.Runtime.Versioning.NonVersionable]
public static explicit operator T(Nullable<T> value) {
    return value.Value;
}

[System.Runtime.Versioning.NonVersionable]
public T GetValueOrDefault(T defaultValue) {
    return hasValue ? value : defaultValue;
}

The struct Nullable<int> defines an explicit operator as well as a GetValueOrDefault method. The compiler uses one of these two to convert int? to T.

After that it runs the implicit operator Sample<T>(T value).

A rough picture of what happens is this:

Sample<int> sampleA = (Sample<int>)(int)a;

If we print typeof(T) inside the Sample<T> implicit operator, it displays System.Int32.

In your second scenario, the compiler doesn't call the Sample<T> implicit operator at all and simply assigns null to sampleB.
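Both observations can be checked with a small experiment. In this sketch the static counter OperatorCalls and the Console.WriteLine inside the operator are additions made purely for illustration; they are not part of the original class.

```csharp
using System;

public sealed class Sample<T>
{
    public static int OperatorCalls;   // counts operator invocations, for illustration only
    public readonly T Value;

    public Sample(T value) => Value = value;

    public static implicit operator Sample<T>(T value)
    {
        OperatorCalls++;
        Console.WriteLine(typeof(T));  // shows which T the operator was invoked for
        return new Sample<T>(value);
    }
}

public static class InvocationDemo
{
    public static void Main()
    {
        int? a = 3;
        int? b = null;

        Sample<int> sampleA = a;   // operator runs once and prints System.Int32
        Sample<int> sampleB = b;   // operator is skipped entirely

        Console.WriteLine(sampleA.Value);               // 3
        Console.WriteLine(Sample<int>.OperatorCalls);   // 1
        Console.WriteLine(sampleB == null);             // True
    }
}
```

The counter ends at 1, so only the non-null assignment reached the operator.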
