Performance: type derived from generic

I've encountered with one performance problem that I can't quite understand. I know how to fix it but I don't understand Why that happens. It's just for fun!
Let's talk code. I simplified the code as much as I could to reproduce the issue.
Suppose we have a generic class. It has an empty list inside and does something with T in constructor. It has Run method that calls an IEnumerable<T> method on the list, e.g. Any().

public class BaseClass<T>
{
    private List<T> _list = new List<T>();

    public BaseClass()
    {
        Enumerable.Empty<T>();
        // or Enumerable.Repeat(new T(), 10);
        // or even new T();
        // or foreach (var item in _list) {}
    }

    public void Run()
    {
        for (var i = 0; i < 8000000; i++)
        {
            if (_list.Any())
            // or if (_list.Count() > 0)
            // or if (_list.FirstOrDefault() != null)
            // or if (_list.SingleOrDefault() != null)
            // or other IEnumerable<T> method
            {
                return;
            }
        }
    }
}

Then we have a derived class which is empty:

public class DerivedClass : BaseClass<object>
{
}

Let's measure the performance of running ClassBase<T>.Run method from both classes. Accessing from derived type is 4 times slower that from base class. And I can't understand why that happens. Compiled in Release mode, result is the same with warm up. It happens on .NET 4.5 only.

public class Program
{
    public static void Main()
    {
        Measure(new DerivedClass());
        Measure(new BaseClass<object>());
    }

    private static void Measure(BaseClass<object> baseClass)
    {
        var sw = Stopwatch.StartNew();
        baseClass.Run();
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}

Full listing on gist


Solution 1:

Update:
There's an answer from the CLR team on Microsoft Connect

It is related to dictionary lookups in shared generics code. The heuristic in runtime and JIT do not work well for this particular test. We will take a look what can be done about it.

In the meantime, you can workaround it by adding two dummy methods to the BaseClass (do not even need to be called). It will cause the heuristic to work as one would expect.

Original:
That's JIT fail.

Can be fixed by this crazy thing:

    public class BaseClass<T>
    {
        private List<T> _list = new List<T>();

        public BaseClass()
        {
            Enumerable.Empty<T>();
            // or Enumerable.Repeat(new T(), 10);
            // or even new T();
            // or foreach (var item in _list) {}
        }

        public void Run()
        {
            for (var i = 0; i < 8000000; i++)
            {
                if (_list.Any())
                {
                    return;
                }
            }
        }

        public void Run2()
        {
            for (var i = 0; i < 8000000; i++)
            {
                if (_list.Any())
                {
                    return;
                }
            }
        }

        public void Run3()
        {
            for (var i = 0; i < 8000000; i++)
            {
                if (_list.Any())
                {
                    return;
                }
            }
        }
    }

Note that Run2()/Run3() are not called from anywhere. But if you comment out Run2 or Run3 methods - you'll get performance penalty as before.

There's something related to stack alignment or to the size of method table, I guess.

P.S. You can replace

 Enumerable.Empty<T>();
 // with
 var x = new Func<IEnumerable<T>>(Enumerable.Empty<T>);

still the same bug.