Do sealed classes really offer performance benefits?
I have come across a lot of optimization tips which say that you should mark your classes as sealed to get extra performance benefits.
I ran some tests to check the performance differential and found none. Am I doing something wrong? Am I missing the case where sealed classes will give better results?
Has anyone run tests and seen a difference?
Help me learn :)
Solution 1:
The answer was no, sealed classes do not perform better than non-sealed.
2021: The answer is now yes, there are performance benefits to sealing a class.
Sealing a class may not always provide a performance boost, but the dotnet team are adopting the rule of sealing all internal classes to give the optimiser the best chance.
For details you can read https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-6/#peanut-butter
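As a rough illustration of one case where sealing can matter, here is a minimal sketch of a type check against a sealed versus a non-sealed class. The class and method names (OpenShape, SealedShape, TypeCheckDemo) are made up for this example, and whether the JIT actually emits a cheaper check depends on the runtime version, so treat it as something to measure, not a guarantee.

```csharp
using System;

// Hypothetical types for illustration only.
public class OpenShape { }
public sealed class SealedShape { }

public static class TypeCheckDemo
{
    // OpenShape can have subclasses, so a match may be OpenShape itself
    // or any type derived from it.
    public static bool IsOpen(object o) => o is OpenShape;

    // SealedShape cannot have subclasses, so only an exact type match can
    // succeed. That leaves the runtime room for a simpler check.
    public static bool IsSealed(object o) => o is SealedShape;

    public static void Main()
    {
        Console.WriteLine(IsOpen(new OpenShape()));
        Console.WriteLine(IsSealed(new SealedShape()));
        Console.WriteLine(IsSealed("not a shape"));
    }
}
```

Both methods behave identically from the caller's point of view; any difference is only in the machine code the JIT is allowed to generate.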
Old answer below.
The issue comes down to the call vs callvirt IL op codes. Call is faster than callvirt, and callvirt is mainly used when you don't know if the object has been subclassed. So people assume that if you seal a class all the op codes will change from callvirts to calls and will be faster.
Unfortunately callvirt does other things that make it useful too, like checking for null references. This means that even if a class is sealed, the reference might still be null and thus a callvirt is needed. You can get around this (without needing to seal the class), but it becomes a bit pointless.
Structs use call because they cannot be subclassed and are never null.
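To make the null-check point concrete, here is a small sketch. Greeter and NullCheckDemo are hypothetical names invented for this example. The method is non-virtual, lives in a sealed class, and never touches this, yet calling it through a null reference still has to fail, which is the check the callvirt gives you.

```csharp
using System;

// Hypothetical sealed class for illustration only.
public sealed class Greeter
{
    // Non-virtual, and does not read any instance state.
    public string Hello() => "hello";
}

public static class NullCheckDemo
{
    // Returns true if calling an instance method through a null
    // reference throws NullReferenceException.
    public static bool ThrowsOnNull()
    {
        Greeter g = null;
        try
        {
            g.Hello();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsOnNull());
    }
}
```

So sealing alone cannot remove the need for the check; the compiler would also have to know the reference is not null.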
See this question for more information: Call and callvirt
Solution 2:
The JITter will sometimes use non-virtual calls to methods in sealed classes since there is no way they can be extended further.
There are complex rules regarding calling type, virtual/nonvirtual, and I don't know them all so I can't really outline them for you, but if you google for sealed classes and virtual methods you might find some articles on the topic.
Note that any performance benefit you would obtain from this level of optimization should be regarded as a last resort. Always optimize at the algorithmic level before you optimize at the code level.
Here's one link mentioning this: Rambling on the sealed keyword
Solution 3:
Update: As of .NET Core 2.0 and .NET Framework 4.7.1, the CLR now supports devirtualization. It can take methods in sealed classes and replace virtual calls with direct calls - and it can also do this for non-sealed classes if it can figure out it's safe to do so.
In such a case (a sealed class that the CLR couldn't otherwise detect as safe to devirtualise), a sealed class should actually offer some kind of performance benefit.
That said, I wouldn't think it'd be worth worrying about unless you had already profiled the code and determined that you were in a particularly hot path being called millions of times, or something like that:
https://blogs.msdn.microsoft.com/dotnet/2017/06/29/performance-improvements-in-ryujit-in-net-core-and-net-framework/
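For reference, this is the kind of shape where devirtualization could apply. It is only a sketch: Animal, Dog, Cat and DevirtDemo are hypothetical names, and whether the JIT really replaces the virtual call depends on the runtime version and what it can prove at that call site.

```csharp
using System;

// Hypothetical class hierarchy for illustration only.
public class Animal
{
    public virtual string Speak() => "...";
}

public class Dog : Animal
{
    public override string Speak() => "Woof";
}

public sealed class Cat : Animal
{
    public override string Speak() => "Meow";
}

public static class DevirtDemo
{
    // `dog` might refer to a subclass of Dog that overrides Speak again,
    // so in general this has to stay a virtual call.
    public static string Ask(Dog dog) => dog.Speak();

    // Cat is sealed, so Cat.Speak is the only possible target here.
    // A JIT that supports devirtualization may turn this into a direct
    // call, and possibly inline it.
    public static string Ask(Cat cat) => cat.Speak();

    public static void Main()
    {
        Console.WriteLine(Ask(new Dog()));
        Console.WriteLine(Ask(new Cat()));
    }
}
```

Note the difference from the test program in the original answer below: there the methods are not virtual at all, so there is nothing to devirtualize in the first place.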
Original Answer:
I made the following test program, and then decompiled it using Reflector to see what MSIL code was emitted.
public class NormalClass {
    public void WriteIt(string x) {
        Console.WriteLine("NormalClass");
        Console.WriteLine(x);
    }
}
public sealed class SealedClass {
    public void WriteIt(string x) {
        Console.WriteLine("SealedClass");
        Console.WriteLine(x);
    }
}
public static void CallNormal() {
    var n = new NormalClass();
    n.WriteIt("a string");
}
public static void CallSealed() {
    var n = new SealedClass();
    n.WriteIt("a string");
}
In all cases, the C# compiler (Visual Studio 2010 in Release build configuration) emits identical MSIL, which is as follows:
L_0000: newobj instance void <NormalClass or SealedClass>::.ctor()
L_0005: stloc.0
L_0006: ldloc.0
L_0007: ldstr "a string"
L_000c: callvirt instance void <NormalClass or SealedClass>::WriteIt(string)
L_0011: ret
The oft-quoted reason that people say sealed provides performance benefits is that the compiler knows the class isn't overridden, and thus can use call instead of callvirt as it doesn't have to check for virtuals, etc. As proven above, this is not true.
My next thought was that even though the MSIL is identical, perhaps the JIT compiler treats sealed classes differently?
I ran a release build under the Visual Studio debugger and viewed the decompiled x86 output. In both cases, the x86 code was identical, with the exception of class names and function memory addresses (which of course must be different). Here it is:
// var n = new NormalClass();
00000000 push ebp
00000001 mov ebp,esp
00000003 sub esp,8
00000006 cmp dword ptr ds:[00585314h],0
0000000d je 00000014
0000000f call 70032C33
00000014 xor edx,edx
00000016 mov dword ptr [ebp-4],edx
00000019 mov ecx,588230h
0000001e call FFEEEBC0
00000023 mov dword ptr [ebp-8],eax
00000026 mov ecx,dword ptr [ebp-8]
00000029 call dword ptr ds:[00588260h]
0000002f mov eax,dword ptr [ebp-8]
00000032 mov dword ptr [ebp-4],eax
// n.WriteIt("a string");
00000035 mov edx,dword ptr ds:[033220DCh]
0000003b mov ecx,dword ptr [ebp-4]
0000003e cmp dword ptr [ecx],ecx
00000040 call dword ptr ds:[0058827Ch]
// }
00000046 nop
00000047 mov esp,ebp
00000049 pop ebp
0000004a ret
I then thought perhaps running under the debugger causes it to perform less aggressive optimization?
I then ran a standalone release build executable outside of any debugging environment, and used WinDBG + SOS to break in after the program had completed and view the disassembly of the JIT compiled x86 code.
As you can see from the code below, when running outside the debugger the JIT compiler is more aggressive, and it has inlined the WriteIt method straight into the caller.
The crucial thing however is that it was identical when calling a sealed vs non-sealed class. There is no difference whatsoever between a sealed or nonsealed class.
Here it is when calling a normal class:
Normal JIT generated code
Begin 003c00b0, size 39
003c00b0 55 push ebp
003c00b1 8bec mov ebp,esp
003c00b3 b994391800 mov ecx,183994h (MT: ScratchConsoleApplicationFX4.NormalClass)
003c00b8 e8631fdbff call 00172020 (JitHelp: CORINFO_HELP_NEWSFAST)
003c00bd e80e70106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c00c2 8bc8 mov ecx,eax
003c00c4 8b1530203003 mov edx,dword ptr ds:[3302030h] ("NormalClass")
003c00ca 8b01 mov eax,dword ptr [ecx]
003c00cc 8b403c mov eax,dword ptr [eax+3Ch]
003c00cf ff5010 call dword ptr [eax+10h]
003c00d2 e8f96f106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c00d7 8bc8 mov ecx,eax
003c00d9 8b1534203003 mov edx,dword ptr ds:[3302034h] ("a string")
003c00df 8b01 mov eax,dword ptr [ecx]
003c00e1 8b403c mov eax,dword ptr [eax+3Ch]
003c00e4 ff5010 call dword ptr [eax+10h]
003c00e7 5d pop ebp
003c00e8 c3 ret
Vs a sealed class:
Normal JIT generated code
Begin 003c0100, size 39
003c0100 55 push ebp
003c0101 8bec mov ebp,esp
003c0103 b90c3a1800 mov ecx,183A0Ch (MT: ScratchConsoleApplicationFX4.SealedClass)
003c0108 e8131fdbff call 00172020 (JitHelp: CORINFO_HELP_NEWSFAST)
003c010d e8be6f106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c0112 8bc8 mov ecx,eax
003c0114 8b1538203003 mov edx,dword ptr ds:[3302038h] ("SealedClass")
003c011a 8b01 mov eax,dword ptr [ecx]
003c011c 8b403c mov eax,dword ptr [eax+3Ch]
003c011f ff5010 call dword ptr [eax+10h]
003c0122 e8a96f106f call mscorlib_ni+0x2570d0 (6f4c70d0) (System.Console.get_Out(), mdToken: 060008fd)
003c0127 8bc8 mov ecx,eax
003c0129 8b1534203003 mov edx,dword ptr ds:[3302034h] ("a string")
003c012f 8b01 mov eax,dword ptr [ecx]
003c0131 8b403c mov eax,dword ptr [eax+3Ch]
003c0134 ff5010 call dword ptr [eax+10h]
003c0137 5d pop ebp
003c0138 c3 ret
To me, this provides solid proof that there cannot be any performance improvement between calling methods on sealed vs non-sealed classes... I think I'm happy now :-)
Solution 4:
As far as I know, there is no guarantee of a performance benefit. But a sealed method gives a chance to reduce a performance penalty under some specific conditions. (Sealing a class effectively seals all of its methods.)
But it's up to the compiler implementation and the execution environment.
Details
Many modern CPUs use a long pipeline structure to increase performance. Because the CPU is far faster than memory, it has to prefetch code from memory to keep the pipeline fed. If the code is not ready at the proper time, the pipeline sits idle.
There is a big obstacle called dynamic dispatch which disrupts this 'prefetching' optimization. You can think of it as just a conditional branch.
// Value of `v` is unknown,
// and can be resolved only at runtime.
// CPU cannot know which code to prefetch.
// Therefore, just prefetch any one of a() or b().
// This is *speculative execution*.
int v = random();
if (v==1) a();
else b();
The CPU cannot prefetch the next code to execute in this case, because the next code position is unknown until the condition is resolved. This creates a hazard that leaves the pipeline idle, and the performance penalty of an idle pipeline is usually large.
A similar thing happens with method overriding. The compiler may be able to determine the right override for a given method call, but sometimes that is impossible. In that case, the right method can be determined only at runtime. This is also a case of dynamic dispatch, and a main reason why dynamically typed languages are generally slower than statically typed languages.
Some CPUs (including recent Intel x86 chips) use a technique called speculative execution to keep the pipeline busy even in this situation: they just prefetch one of the execution paths. But the guess is not always right, and a failed speculation causes a pipeline stall, which also carries a large performance penalty. (This depends entirely on the CPU implementation. Some mobile CPUs are known to skip this kind of optimization to save energy.)
Basically, C# is a statically compiled language. But not always. I don't know the exact conditions, and this is entirely up to the compiler implementation. Some compilers can eliminate the possibility of dynamic dispatch by preventing method overriding if the method is marked as sealed. Stupid compilers may not.
This is the performance benefit of sealed.
This answer (Why is it faster to process a sorted array than an unsorted array?) describes branch prediction a lot better.
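To show what a sealed method (as opposed to a sealed class) looks like, here is a minimal sketch. Base, Middle, Leaf and SealedMethodDemo are hypothetical names for this example. Once Middle seals its override, no further subclass can replace it, so any call through a Middle reference has exactly one possible target. Whether a given compiler or JIT takes advantage of that is up to the implementation.

```csharp
using System;

// Hypothetical class hierarchy for illustration only.
public class Base
{
    public virtual int Value() => 1;
}

public class Middle : Base
{
    // `sealed override` stops subclasses of Middle from overriding Value.
    public sealed override int Value() => 2;
}

public class Leaf : Middle
{
    // public override int Value() => 3; // would be a compile error
}

public static class SealedMethodDemo
{
    // Whatever subclass `m` refers to, Middle.Value is the only method
    // that can run here.
    public static int Read(Middle m) => m.Value();

    public static void Main()
    {
        Console.WriteLine(Read(new Middle()));
        Console.WriteLine(Read(new Leaf()));
    }
}
```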
Solution 5:
<off-topic-rant>
I loathe sealed classes. Even if the performance benefits are astounding (which I doubt), they destroy the object-oriented model by preventing reuse via inheritance. For example, the Thread class is sealed. While I can see that one might want threads to be as efficient as possible, I can also imagine scenarios where being able to subclass Thread would have great benefits. Class authors, if you must seal your classes for "performance" reasons, please provide an interface at the very least so we don't have to wrap-and-replace everywhere that we need a feature you forgot.
Example: SafeThread had to wrap the Thread class because Thread is sealed and there is no IThread interface; SafeThread automatically traps unhandled exceptions on threads, something completely missing from the Thread class. [and no, the unhandled exception events do not pick up unhandled exceptions in secondary threads].
</off-topic-rant>