Are .Net switch statements hashed or indexed?

Does .Net 4 (or any prior version) perform any sort of optimization on longer switch statements based on strings?

I'm working around a potential performance bottleneck due to some long switch statements looking for matching strings in the cases, and I've always assumed these are searched in linear time (or near linear, i.e. not using an index to quickly find the matching string). But this seems like an obvious area that .Net could optimize, so thought I'd check if this is the case or not.

This is a derivative question from my recent one: indexed switch statement, or equivalent? .net, C#


Solution 1:

Compile the following code.

public static int Main(string[] args)
{
    switch (args[0])
    {
        case "x": return 1;
        case "y": return 2;
        case "z": return 3;
    }
    return 0;
}

Now use Reflector or ILDASM to examine the IL the C# compiler generates. Keep adding case statements and decompiling and observe the result.

  • If the number of case statements is small then the compiler emits a sequential equality comparison.
  • If the number of case statements is large then the compiler emits a Dictionary lookup.

I was using the C# 3.0 compiler and I observed that the strategy changes at 7 case statements. I suspect you will see something similiar with C# 4.0 and others.

Update:

I should point that you will see calls to Dictionary.Add in the IL output where it is building up the dictionary for later use. Do not be fooled into thinking this happens everytime. The compiler is actually generating a separate static class and doing an inline static initialization of it. Pay particular attention to the instruction at L_0026. If the class is already initialized then the branch will skip over the Add calls.

L_0021: ldsfld class [mscorlib]System.Collections.Generic.Dictionary`2<string, int32> <PrivateImplementationDetails>{816396DD-F271-4C12-83D0-CC9C9CD67AD6}::$$method0x6000001-1
L_0026: brtrue.s L_0089
L_0028: ldc.i4.7 
L_0029: newobj instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::.ctor(int32)
L_002e: dup 
L_002f: ldstr "x"
L_0034: ldc.i4.0 
L_0035: call instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)
L_003a: dup 
L_003b: ldstr "y"
L_0040: ldc.i4.1 
L_0041: call instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)
L_0046: dup 
L_0047: ldstr "z"
L_004c: ldc.i4.2 
L_004d: call instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)

Also, notice that the dictionary actually contains a map from the original string to an integer. This integer is used to formulate a separate switch in IL.

L_0089: volatile. 
L_008b: ldsfld class [mscorlib]System.Collections.Generic.Dictionary`2<string, int32> <PrivateImplementationDetails>{816396DD-F271-4C12-83D0-CC9C9CD67AD6}::$$method0x6000001-1
L_0090: ldloc.2 
L_0091: ldloca.s CS$0$0002
L_0093: call instance bool [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::TryGetValue(!0, !1&)
L_0098: brfalse.s L_00da
L_009a: ldloc.3 
L_009b: switch (L_00be, L_00c2, L_00c6, L_00ca, L_00ce, L_00d2, L_00d6)
L_00bc: br.s L_00da
L_00be: ldc.i4.1 
L_00bf: stloc.1 
L_00c0: br.s L_00de
L_00c2: ldc.i4.2 
L_00c3: stloc.1 
L_00c4: br.s L_00de
L_00c6: ldc.i4.3 

Update 2:

For what it is worth VB.NET does not seem to have this same optimization for its Select construct.

Solution 2:

Looks like the newer compilers use ComputeStringHash() and then string comparison on a hash hit instead of dictionary construction.

// [19 6 - 19 22]
IL_0037: ldarg.0      // args
IL_0038: ldc.i4.0     
IL_0039: ldelem.ref   
IL_003a: stloc.s      V_5

IL_003c: ldloc.s      V_5
IL_003e: call         unsigned int32 '<PrivateImplementationDetails>'::ComputeStringHash(string)
IL_0043: stloc.s      V_6
IL_0045: ldloc.s      V_6
IL_0047: ldc.i4       -502520314 // 0xe20c2606
IL_004c: bgt.un.s     IL_007b
IL_004e: ldloc.s      V_6
IL_0050: ldc.i4       -536075552 // 0xe00c22e0
IL_0055: beq          IL_00f9
IL_005a: br.s         IL_005c
IL_005c: ldloc.s      V_6
IL_005e: ldc.i4       -519297933 // 0xe10c2473
IL_0063: beq          IL_00e9
IL_0068: br.s         IL_006a
IL_006a: ldloc.s      V_6
IL_006c: ldc.i4       -502520314 // 0xe20c2606
IL_0071: beq          IL_0119
IL_0076: br           IL_014c
IL_007b: ldloc.s      V_6
IL_007d: ldc.i4       -468965076 // 0xe40c292c
IL_0082: bgt.un.s     IL_009d
IL_0084: ldloc.s      V_6
IL_0086: ldc.i4       -485742695 // 0xe30c2799
IL_008b: beq.s        IL_0109
IL_008d: br.s         IL_008f
IL_008f: ldloc.s      V_6
IL_0091: ldc.i4       -468965076 // 0xe40c292c
IL_0096: beq.s        IL_00b6
IL_0098: br           IL_014c
IL_009d: ldloc.s      V_6
IL_009f: ldc.i4       -435409838 // 0xe60c2c52
IL_00a4: beq.s        IL_00d9
IL_00a6: br.s         IL_00a8
IL_00a8: ldloc.s      V_6
IL_00aa: ldc.i4       -418632219 // 0xe70c2de5
IL_00af: beq.s        IL_00c9
IL_00b1: br           IL_014c
IL_00b6: ldloc.s      V_5
IL_00b8: ldstr        "a"
IL_00bd: call         bool [mscorlib]System.String::op_Equality(string, string)
IL_00c2: brtrue.s     IL_0129
IL_00c4: br           IL_014c
IL_00c9: ldloc.s      V_5
IL_00cb: ldstr        "b"
IL_00d0: call         bool [mscorlib]System.String::op_Equality(string, string)
IL_00d5: brtrue.s     IL_012e
IL_00d7: br.s         IL_014c
IL_00d9: ldloc.s      V_5
IL_00db: ldstr        "c"
IL_00e0: call         bool [mscorlib]System.String::op_Equality(string, string)
IL_00e5: brtrue.s     IL_0133
IL_00e7: br.s         IL_014c
IL_00e9: ldloc.s      V_5
IL_00eb: ldstr        "d"
IL_00f0: call         bool [mscorlib]System.String::op_Equality(string, string)
IL_00f5: brtrue.s     IL_0138
IL_00f7: br.s         IL_014c
IL_00f9: ldloc.s      V_5
IL_00fb: ldstr        "e"
IL_0100: call         bool [mscorlib]System.String::op_Equality(string, string)
IL_0105: brtrue.s     IL_013d
IL_0107: br.s         IL_014c
IL_0109: ldloc.s      V_5
IL_010b: ldstr        "f"
IL_0110: call         bool [mscorlib]System.String::op_Equality(string, string)
IL_0115: brtrue.s     IL_0142
IL_0117: br.s         IL_014c
IL_0119: ldloc.s      V_5
IL_011b: ldstr        "g"
IL_0120: call         bool [mscorlib]System.String::op_Equality(string, string)
IL_0125: brtrue.s     IL_0147
IL_0127: br.s         IL_014c

// [21 17 - 21 26]
IL_0129: ldc.i4.0     
IL_012a: stloc.s      V_7
IL_012c: br.s         IL_01ac

// [22 17 - 22 26]
IL_012e: ldc.i4.1     
IL_012f: stloc.s      V_7
IL_0131: br.s         IL_01ac

// [23 17 - 23 26]
IL_0133: ldc.i4.2     
IL_0134: stloc.s      V_7
IL_0136: br.s         IL_01ac

// [24 17 - 24 26]
IL_0138: ldc.i4.3     
IL_0139: stloc.s      V_7
IL_013b: br.s         IL_01ac

// [25 17 - 25 26]
IL_013d: ldc.i4.4     
IL_013e: stloc.s      V_7
IL_0140: br.s         IL_01ac

// [26 17 - 26 26]
IL_0142: ldc.i4.5     
IL_0143: stloc.s      V_7
IL_0145: br.s         IL_01ac

// [27 17 - 27 26]
IL_0147: ldc.i4.6     
IL_0148: stloc.s      V_7
IL_014a: br.s         IL_01ac

// [28 16 - 28 26]
IL_014c: ldc.i4.m1    
IL_014d: stloc.s      V_7
IL_014f: br.s         IL_01ac