Where are generic methods stored?
I've read some information about generics in .NET and noticed one interesting thing.
For example, if I have a generic class:
class Foo<T>
{
public static int Counter;
}
Console.WriteLine(++Foo<int>.Counter); //1
Console.WriteLine(++Foo<string>.Counter); //1
Two classes Foo<int>
and Foo<string>
are different at runtime. But what about the case where a non-generic class has a generic method?
class Foo
{
public void Bar<T>()
{
}
}
It's obvious that there's only one Foo
class. But what about method Bar
? All generic classes and methods are closed at runtime over the type arguments they are used with. Does that mean class Foo
has many implementations of Bar
? And where is the information about this method stored in memory?
Solution 1:
Unlike C++ templates, .NET generics are instantiated at runtime, not at compile time. Semantically, if you instantiate a generic class with different type arguments, the results behave as if they were different classes, but under the hood there is only one class in the compiled IL (intermediate language) code.
Generic types
The difference between different instantiations of the same generic type becomes apparent when you use Reflection: typeof(YourClass<int>)
will not be the same as typeof(YourClass<string>)
. These are called constructed generic types. There also exists a typeof(YourClass<>)
which represents the generic type definition. Here are some further tips on dealing with generics via Reflection.
When you instantiate a constructed generic class, the runtime generates a specialized class on the fly. There are subtle differences between how it works with value and reference types.
- The compiler will only generate a single generic type into the assembly.
- The runtime creates a separate version of your generic class for each value type you use it with.
- The runtime allocates a separate set of static fields for each constructed type, that is, for each distinct type argument the generic class is used with.
- Because all references have the same size, the runtime can reuse the specialized code it generated the first time you used the class with a reference type.
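The point about static fields can be observed from ordinary code. This is a small sketch that reuses the Foo<T> class from the question; StaticFieldsDemo and Run are names invented for the example. Foo<string> and Foo<object> are both reference-type instantiations, so the runtime may share their compiled code, yet each still keeps its own Counter.

```csharp
using System;

class Foo<T>
{
    public static int Counter;
}

static class StaticFieldsDemo
{
    // Writes a different value into each constructed type's field, then reads all three back.
    public static int[] Run()
    {
        Foo<int>.Counter = 1;
        Foo<string>.Counter = 10;
        Foo<object>.Counter = 100;
        return new[] { Foo<int>.Counter, Foo<string>.Counter, Foo<object>.Counter };
    }

    static void Main()
    {
        // None of the writes overwrote another, so each constructed type has its own field.
        Console.WriteLine(string.Join(", ", Run())); // 1, 10, 100
    }
}
```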
Generic methods
For generic methods, the principles are the same.
- The compiler only generates one generic method, which is the generic method definition.
- At runtime, each distinct specialization of the method is treated as a different method of the same class.
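The same can be seen for the non-generic Foo with a generic Bar<T> from the question. In this sketch, GenericMethodDemo, Definition and Close are hypothetical helper names. It only shows what Reflection reports about the methods, not how the runtime lays them out in memory.

```csharp
using System;
using System.Reflection;

class Foo
{
    public void Bar<T>() { }
}

static class GenericMethodDemo
{
    // The single generic method definition the compiler emitted into the assembly.
    public static MethodInfo Definition() => typeof(Foo).GetMethod("Bar");

    // A constructed method: the definition closed over one type argument.
    public static MethodInfo Close(Type typeArgument) =>
        Definition().MakeGenericMethod(typeArgument);

    static void Main()
    {
        MethodInfo open = Definition();
        MethodInfo barInt = Close(typeof(int));
        MethodInfo barString = Close(typeof(string));

        Console.WriteLine(open.IsGenericMethodDefinition);         // True
        Console.WriteLine(barInt.IsGenericMethodDefinition);       // False
        Console.WriteLine(barInt.Equals(barString));               // False
        Console.WriteLine(barInt.DeclaringType == typeof(Foo));    // True
        Console.WriteLine(barString.DeclaringType == typeof(Foo)); // True
    }
}
```

Both constructed methods report Foo as their declaring type, which matches the idea that they are different methods of the same class.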
Solution 2:
First off, let's clarify two things. This is a generic method definition:
T M<T>(T x)
{
return x;
}
This is a generic type definition:
class C<T>
{
}
Most likely, if I ask you what M
is, you'll say that it's a generic method that takes a T
and returns a T
. That's absolutely correct, but I propose a different way of thinking about it -- there are two sets of parameters here. One is the type T
, the other is the object x
. If we combine them, we know that collectively this method takes two parameters in total.
The concept of currying tells us that a function that takes two parameters can be transformed to a function that takes one parameter and returns another function that takes the other parameter (and vice versa). For example, here's a function that takes two integers and produces their sum:
Func<int, int, int> uncurry = (x, y) => x + y;
int sum = uncurry(1, 3);
And here's an equivalent form, where we have a function that takes one integer and produces a function that takes another integer and returns the sum of those aforementioned integers:
Func<int, Func<int, int>> curry = x => y => x + y;
int sum = curry(1)(3);
We went from having one function that takes two integers to having a function that takes an integer and creates functions. Obviously, these two aren't literally the same thing in C#, but they are two different ways of saying the same thing, because passing the same information will eventually get you to the same final result.
Currying allows us to reason about functions more easily (it's easier to reason about one parameter than two), and it allows us to know that our conclusions still hold for any number of parameters.
Consider for a moment that, on an abstract level, this is what takes place here. Let's say M
is a "super-function" that takes a type T
and returns a regular method. That returned method takes a T
value and returns a T
value.
For example, if we call the super-function M
with the argument int
, we get a regular method from int
to int
:
Func<int, int> e = M<int>;
And if we call that regular method with the argument 5
, we get a 5
back, as we expected:
int v = e(5);
So, consider the following expression:
int v = M<int>(5);
Do you see now why this could be considered as two separate calls? You can recognize the call to the super-function because its arguments are passed in <>
. Then the call to the returned method follows, where the arguments are passed in ()
. It's analogous to the previous example:
curry(1)(3);
And similarly, a generic type definition is also a super-function that takes a type and returns another type. For example, List<int>
is a call to the super-function List
with an argument int
that returns a type that's a list of integers.
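Reflection makes this "call" explicit. In the sketch below, ListOf is a hypothetical helper that applies the generic type definition List<> to a type argument using Type.MakeGenericType, and the result is compared with the type written in ordinary C# syntax.

```csharp
using System;
using System.Collections.Generic;

static class TypeFunctionDemo
{
    // "Calls" the super-function List<> with one type argument and returns the resulting type.
    public static Type ListOf(Type elementType) =>
        typeof(List<>).MakeGenericType(elementType);

    static void Main()
    {
        Console.WriteLine(ListOf(typeof(int)) == typeof(List<int>));               // True
        Console.WriteLine(ListOf(typeof(string)) == typeof(List<string>));         // True
        // The result is itself a type, so it can be passed as an argument again.
        Console.WriteLine(ListOf(ListOf(typeof(int))) == typeof(List<List<int>>)); // True
    }
}
```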
Now when the C# compiler meets a regular method, it compiles it as a regular method. It doesn't attempt to create different definitions for different possible arguments. So, this:
int Square(int x) => x * x;
gets compiled as it is. It does not get compiled as:
int Square__0() => 0;
int Square__1() => 1;
int Square__2() => 4;
// and so on
In other words, the C# compiler does not evaluate all possible arguments for this method in order to embed them into the final executable -- rather, it leaves the method in its parameterized form and trusts that the result will be evaluated at runtime.
Similarly, when the C# compiler meets a super-function (a generic method or type definition), it compiles it as a super-function. It doesn't attempt to create different definitions for different possible arguments. So, this:
T M<T>(T x) => x;
gets compiled as it is. It does not get compiled as:
int M(int x) => x;
int[] M(int[] x) => x;
int[][] M(int[][] x) => x;
// and so on
float M(float x) => x;
float[] M(float[] x) => x;
float[][] M(float[][] x) => x;
// and so on
Again, the C# compiler trusts that when this super-function is called, it will be evaluated at runtime, and the regular method or type will be produced by that evaluation.
This is one of the reasons why C# benefits from having a JIT compiler as part of its runtime. When a super-function is evaluated, it produces a brand new method or type that wasn't there at compile time! We call that process reification. Subsequently, the runtime remembers the result so it won't have to create it again. That part is called memoization.
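One way to watch this happen is a static constructor, which runs once per constructed type. The sketch below is only an illustration under that assumption: Box<T>, Log and Make are invented names, and the type argument is supplied as a runtime value so that the constructed type is requested through Reflection rather than written out in the source.

```csharp
using System;
using System.Collections.Generic;

static class Log
{
    public static readonly List<string> Entries = new List<string>();
}

class Box<T>
{
    // Runs once for each constructed type, the first time that type is used.
    static Box() { Log.Entries.Add(typeof(T).Name); }
}

static class ReificationDemo
{
    // Builds Box<typeArgument> at runtime and creates an instance of it.
    public static object Make(Type typeArgument)
    {
        Type closed = typeof(Box<>).MakeGenericType(typeArgument);
        return Activator.CreateInstance(closed);
    }

    static void Main()
    {
        Make(typeof(int));
        Make(typeof(int));      // Box<int> already exists, so no new log entry
        Make(typeof(DateTime));
        Console.WriteLine(string.Join(", ", Log.Entries)); // Int32, DateTime
    }
}
```

The second request for Box<int> reuses the type produced by the first one, which is the remembered result described above.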
Compare with C++, which doesn't require a JIT compiler as part of its runtime. The C++ compiler actually needs to evaluate the super-functions (called "templates") at compile time. That's a feasible option because the arguments of the super-functions are restricted to things that can be evaluated at compile time.
So, to answer your question:
class Foo
{
public void Bar()
{
}
}
Foo
is a regular type and there's only one of it. Bar
is a regular method inside Foo
and there's only one of it.
class Foo<T>
{
public void Bar()
{
}
}
Foo<T>
is a super-function that creates types at runtime. Each one of those resulting types has its own regular method named Bar
and there's only one of it (for each type).
class Foo
{
public void Bar<T>()
{
}
}
Foo
is a regular type and there's only one of it. Bar<T>
is a super-function that creates regular methods at runtime. Each one of those resulting methods will then be considered part of the regular type Foo
.
class Foo<T1>
{
public void Bar<T2>()
{
}
}
Foo<T1>
is a super-function that creates types at runtime. Each one of those resulting types has its own super-function named Bar<T2>
that creates regular methods at runtime (at a later time). Each one of those resulting methods is considered part of the type that created the corresponding super-function.
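The two steps of this last case can be spelled out with Reflection. This is a sketch, not a description of the runtime's internals: Bar<T2> is changed here to return a string so that the chosen type arguments are visible, and TwoStepDemo and Invoke are hypothetical names.

```csharp
using System;
using System.Reflection;

class Foo<T1>
{
    // Returns the names of both type arguments so the result can be inspected.
    public string Bar<T2>() => typeof(T1).Name + "/" + typeof(T2).Name;
}

static class TwoStepDemo
{
    public static string Invoke(Type t1, Type t2)
    {
        // Step 1: evaluate the type-level super-function Foo<> to get a regular type.
        Type closedType = typeof(Foo<>).MakeGenericType(t1);
        // That type has its own Bar, which is still a generic method definition.
        MethodInfo openBar = closedType.GetMethod("Bar");
        // Step 2: evaluate the method-level super-function to get a regular method.
        MethodInfo closedBar = openBar.MakeGenericMethod(t2);

        object instance = Activator.CreateInstance(closedType);
        return (string)closedBar.Invoke(instance, null);
    }

    static void Main()
    {
        Console.WriteLine(Invoke(typeof(int), typeof(string))); // Int32/String
        Console.WriteLine(Invoke(typeof(int), typeof(double))); // Int32/Double
        Console.WriteLine(new Foo<bool>().Bar<char>());         // Boolean/Char
    }
}
```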
The above is the conceptual explanation. Beyond it, certain optimizations can be implemented to reduce the number of distinct implementations in memory -- e.g. two constructed methods can share a single machine-code implementation under certain circumstances. See Luaan's answer about why the CLR can do this and when it actually does it.