What is quicker, switch on string or elseif on type?

performance c#

Greg's profile results are great for the exact scenario he covered, but interestingly, the relative costs of the different methods change dramatically when considering a number of different factors including the number of types being compared, and the relative frequency and any patterns in the underlying data.

The simple answer is that nobody can tell you what the performance difference is going to be in your specific scenario, you will need to measure the performance in different ways yourself in your own system to get an accurate answer.

The If/Else chain is an effective approach for a small number of type comparisons, or if you can reliably predict which few types are going to make up the majority of the ones that you see. The potential problem with the approach is that as the number of types increases, the number of comparisons that must be executed increases as well.

if I execute the following:

int value = 25124;
if(value == 0) ...
else if (value == 1) ...
else if (value == 2) ...
...
else if (value == 25124) ...

each of the previous if conditions must be evaluated before the correct block is entered. On the other hand

switch(value) {
 case 0:...break;
 case 1:...break;
 case 2:...break;
 ...
 case 25124:...break;
}

will perform one simple jump to the correct bit of code.

Where it gets more complicated in your example is that your other method uses a switch on strings rather than integers which gets a little more complicated. At a low level, strings can't be switched on in the same way that integer values can so the C# compiler does some magic to make this work for you.

If the switch statement is "small enough" (where the compiler does what it thinks is best automatically) switching on strings generates code that is the same as an if/else chain.

switch(someString) {
    case "Foo": DoFoo(); break;
    case "Bar": DoBar(); break;
    default: DoOther; break;
}

is the same as:

if(someString == "Foo") {
    DoFoo();
} else if(someString == "Bar") {
    DoBar();
} else {
    DoOther();
}

Once the list of items in the dictionary gets "big enough" the compiler will automatically create an internal dictionary that maps from the strings in the switch to an integer index and then a switch based on that index.

It looks something like this (Just imagine more entries than I am going to bother to type)

A static field is defined in a "hidden" location that is associated with the class containing the switch statement of type Dictionary<string, int> and given a mangled name

//Make sure the dictionary is loaded
if(theDictionary == null) { 
    //This is simplified for clarity, the actual implementation is more complex 
    // in order to ensure thread safety
    theDictionary = new Dictionary<string,int>();
    theDictionary["Foo"] = 0;
    theDictionary["Bar"] = 1;
}

int switchIndex;
if(theDictionary.TryGetValue(someString, out switchIndex)) {
    switch(switchIndex) {
    case 0: DoFoo(); break;
    case 1: DoBar(); break;
    }
} else {
    DoOther();
}

In some quick tests that I just ran, the If/Else method is about 3x as fast as the switch for 3 different types (where the types are randomly distributed). At 25 types the switch is faster by a small margin (16%) at 50 types the switch is more than twice as fast.

If you are going to be switching on a large number of types, I would suggest a 3rd method:

private delegate void NodeHandler(ChildNode node);

static Dictionary<RuntimeTypeHandle, NodeHandler> TypeHandleSwitcher = CreateSwitcher();

private static Dictionary<RuntimeTypeHandle, NodeHandler> CreateSwitcher()
{
    var ret = new Dictionary<RuntimeTypeHandle, NodeHandler>();

    ret[typeof(Bob).TypeHandle] = HandleBob;
    ret[typeof(Jill).TypeHandle] = HandleJill;
    ret[typeof(Marko).TypeHandle] = HandleMarko;

    return ret;
}

void HandleChildNode(ChildNode node)
{
    NodeHandler handler;
    if (TaskHandleSwitcher.TryGetValue(Type.GetRuntimeType(node), out handler))
    {
        handler(node);
    }
    else
    {
        //Unexpected type...
    }
}

This is similar to what Ted Elliot suggested, but the usage of runtime type handles instead of full type objects avoids the overhead of loading the type object through reflection.

Here are some quick timings on my machine:

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 5 types
Method                Time    % of optimal
If/Else               179.67  100.00
TypeHandleDictionary  321.33  178.85
TypeDictionary        377.67  210.20
Switch                492.67  274.21

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 10 types
Method                Time    % of optimal
If/Else               271.33  100.00
TypeHandleDictionary  312.00  114.99
TypeDictionary        374.33  137.96
Switch                490.33  180.71

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 15 types
Method                Time    % of optimal
TypeHandleDictionary  312.00  100.00
If/Else               369.00  118.27
TypeDictionary        371.67  119.12
Switch                491.67  157.59

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 20 types
Method                Time    % of optimal
TypeHandleDictionary  335.33  100.00
TypeDictionary        373.00  111.23
If/Else               462.67  137.97
Switch                490.33  146.22

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 25 types
Method                Time    % of optimal
TypeHandleDictionary  319.33  100.00
TypeDictionary        371.00  116.18
Switch                483.00  151.25
If/Else               562.00  175.99

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 50 types
Method                Time      % of optimal
TypeHandleDictionary  319.67    100.00
TypeDictionary        376.67    117.83
Switch                453.33    141.81
If/Else               1,032.67  323.04

On my machine at least, the type handle dictionary approach beats all of the others for anything over 15 different types when the distribution of the types used as input to the method is random.

If on the other hand, the input is composed entirely of the type that is checked first in the if/else chain that method is much faster:

Testing 3 iterations with 5,000,000 data elements (mode=UniformFirst) and 50 types
Method                Time    % of optimal
If/Else               39.00   100.00
TypeHandleDictionary  317.33  813.68
TypeDictionary        396.00  1,015.38
Switch                403.00  1,033.33

Conversely, if the input is always the last thing in the if/else chain, it has the opposite effect:

Testing 3 iterations with 5,000,000 data elements (mode=UniformLast) and 50 types
Method                Time      % of optimal
TypeHandleDictionary  317.67    100.00
Switch                354.33    111.54
TypeDictionary        377.67    118.89
If/Else               1,907.67  600.52

If you can make some assumptions about your input, you might get the best performance from a hybrid approach where you perform if/else checks for the few types that are most common, and then fall back to a dictionary-driven approach if those fail.

Firstly, you're comparing apples and oranges. You'd first need to compare switch on type vs switch on string, and then if on type vs if on string, and then compare the winners.

Secondly, this is the kind of thing OO was designed for. In languages that support OO, switching on type (of any kind) is a code smell that points to poor design. The solution is to derive from a common base with an abstract or virtual method (or a similar construct, depending on your language)

eg.

class Node
{
    public virtual void Action()
    {
        // Perform default action
    }
}

class Bob : Node
{
    public override void Action()
    {
        // Perform action for Bill
    }
}

class Jill : Node
{
    public override void Action()
    {
        // Perform action for Jill
    }
}

Then, instead of doing the switch statement, you just call childNode.Action()

I just implemented a quick test application and profiled it with ANTS 4.
Spec: .Net 3.5 sp1 in 32bit Windows XP, code built in release mode.

3 million tests:

Switch: 1.842 seconds
If: 0.344 seconds.

Furthermore, the switch statement results reveal (unsurprisingly) that longer names take longer.

1 million tests

Bob: 0.612 seconds.
Jill: 0.835 seconds.
Marko: 1.093 seconds.

I looks like the "If Else" is faster, at least the the scenario I created.

class Program
{
    static void Main( string[] args )
    {
        Bob bob = new Bob();
        Jill jill = new Jill();
        Marko marko = new Marko();

        for( int i = 0; i < 1000000; i++ )
        {
            Test( bob );
            Test( jill );
            Test( marko );
        }
    }

    public static void Test( ChildNode childNode )
    {   
        TestSwitch( childNode );
        TestIfElse( childNode );
    }

    private static void TestIfElse( ChildNode childNode )
    {
        if( childNode is Bob ){}
        else if( childNode is Jill ){}
        else if( childNode is Marko ){}
    }

    private static void TestSwitch( ChildNode childNode )
    {
        switch( childNode.Name )
        {
            case "Bob":
                break;
            case "Jill":
                break;
            case "Marko":
                break;
        }
    }
}

class ChildNode { public string Name { get; set; } }

class Bob : ChildNode { public Bob(){ this.Name = "Bob"; }}

class Jill : ChildNode{public Jill(){this.Name = "Jill";}}

class Marko : ChildNode{public Marko(){this.Name = "Marko";}}

What is quicker, switch on string or elseif on type?

Related

Recent Posts