Why does the c# compiler emit Activator.CreateInstance when calling new in with a generic type with a new() constraint?

When you have code like the following:

static T GenericConstruct<T>() where T : new()
{
    return new T();
}

The C# compiler insists on emitting a call to Activator.CreateInstance, which is considerably slower than a native constructor.

I have the following workaround:

public static class ParameterlessConstructor<T>
    where T : new()
{
    public static T Create()
    {
        return _func();
    }

    private static Func<T> CreateFunc()
    {
        return Expression.Lambda<Func<T>>( Expression.New( typeof( T ) ) ).Compile();
    }

    private static Func<T> _func = CreateFunc();
}

// Example:
// Foo foo = ParameterlessConstructor<Foo>.Create();

But it doesn't make sense to me why this workaround should be necessary.


I suspect it's a JITting problem. Currently, the JIT reuses the same generated code for all reference type arguments - so a List<string>'s vtable points to the same machine code as that of List<Stream>. That wouldn't work if each new T() call had to be resolved in the JITted code.

Just a guess, but it makes a certain amount of sense.

One interesting little point: in neither case does the parameterless constructor of a value type get called, if there is one (which is vanishingly rare). See my recent blog post for details. I don't know whether there's any way of forcing it in expression trees.


This is likely because it is not clear whether T is a value type or reference type. The creation of these two types in a non-generic scenario produce very different IL. In the face of this ambiguity, C# is forced to use a universal method of type creation. Activator.CreateInstance fits the bill.

Quick experimentation appears to support this idea. If you type in the following code and examine the IL, it will use initobj instead of CreateInstance because there is no ambiguity on the type.

static void Create<T>()
    where T : struct
{
    var x = new T();
    Console.WriteLine(x.ToString());
}

Switching it to a class and new() constraint though still forces an Activator.CreateInstance.


Why is this workaround necessary?

Because the new() generic constraint was added to C# 2.0 in .NET 2.0.

Expression<T> and friends, meanwhile, were added to .NET 3.5.

So your workaround is necessary because it wasn't possible in .NET 2.0. Meanwhile, (1) using Activator.CreateInstance() was possible, and (2) IL lacks a way to implement 'new T()', so Activator.CreateInstance() was used to implement that behavior.


Interesting observation :)

Here is a simpler variation on your solution:

static T Create<T>() where T : new()
{
  Expression<Func<T>> e = () => new T();
  return e.Compile()();
}

Obviously naive (and possible slow) :)


This is a little bit faster, since the expression is only compiled once:

public class Foo<T> where T : new()
{
    static Expression<Func<T>> x = () => new T();
    static Func<T> f = x.Compile();

    public static T build()
    {
        return f();
    }
}

Analyzing the performance, this method is just as fast as the more verbose compiled expression and much, much faster than new T() (160 times faster on my test PC) .

For a tiny bit better performance, the build method call can be eliminated and the functor can be returned instead, which the client could cache and call directly.

public static Func<T> BuildFn { get { return f; } }