Most efficient way of making an if-elif-elif-else statement when the else is done the most?

The code...

options.get(something, doThisMostOfTheTime)()

...looks like it ought to be faster, but it's actually slower than the if ... elif ... else construct, because it has to call a function, which can be a significant performance overhead in a tight loop.

Consider these examples...

1.py

something = 'something'

for i in xrange(1000000):
    if something == 'this':
        the_thing = 1
    elif something == 'that':
        the_thing = 2
    elif something == 'there':
        the_thing = 3
    else:
        the_thing = 4

2.py

something = 'something'
options = {'this': 1, 'that': 2, 'there': 3}

for i in xrange(1000000):
    the_thing = options.get(something, 4)

3.py

something = 'something'
options = {'this': 1, 'that': 2, 'there': 3}

for i in xrange(1000000):
    if something in options:
        the_thing = options[something]
    else:
        the_thing = 4

4.py

from collections import defaultdict

something = 'something'
options = defaultdict(lambda: 4, {'this': 1, 'that': 2, 'there': 3})

for i in xrange(1000000):
    the_thing = options[something]

...and note the amount of CPU time they use...

1.py: 160ms
2.py: 170ms
3.py: 110ms
4.py: 100ms

...using the user time from time(1).

Option #4 does have the additional memory overhead of adding a new item for every distinct key miss, so if you're expecting an unbounded number of distinct key misses, I'd go with option #3, which is still a significant improvement on the original construct.

I'd create a dictionary :

options = {'this': doThis,'that' :doThat, 'there':doThere}

Now use just:

options.get(something, doThisMostOfTheTime)()

If something is not found in the options dict then dict.get will return the default value doThisMostOfTheTime

Some timing comparisons:

Script:

from random import shuffle
def doThis():pass
def doThat():pass
def doThere():pass
def doSomethingElse():pass
options = {'this':doThis, 'that':doThat, 'there':doThere}
lis = range(10**4) + options.keys()*100
shuffle(lis)

def get():
    for x in lis:
        options.get(x, doSomethingElse)()

def key_in_dic():
    for x in lis:
        if x in options:
            options[x]()
        else:
            doSomethingElse()

def if_else():
    for x in lis:
        if x == 'this':
            doThis()
        elif x == 'that':
            doThat()
        elif x == 'there':
            doThere()
        else:
            doSomethingElse()

Results:

>>> from so import *
>>> %timeit get()
100 loops, best of 3: 5.06 ms per loop
>>> %timeit key_in_dic()
100 loops, best of 3: 3.55 ms per loop
>>> %timeit if_else()
100 loops, best of 3: 6.42 ms per loop

For 10**5 non-existent keys and 100 valid keys::

>>> %timeit get()
10 loops, best of 3: 84.4 ms per loop
>>> %timeit key_in_dic()
10 loops, best of 3: 50.4 ms per loop
>>> %timeit if_else()
10 loops, best of 3: 104 ms per loop

So, for a normal dictionary checking for the key using key in options is the most efficient way here:

if key in options:
   options[key]()
else:
   doSomethingElse()

Are you able to use pypy?

Keeping your original code but running it on pypy gives a 50x speed-up for me.

CPython:

matt$ python
Python 2.6.8 (unknown, Nov 26 2012, 10:25:03)
[GCC 4.2.1 Compatible Apple Clang 3.0 (tags/Apple/clang-211.12)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> from timeit import timeit
>>> timeit("""
... if something == 'this': pass
... elif something == 'that': pass
... elif something == 'there': pass
... else: pass
... """, "something='foo'", number=10000000)
1.728302001953125

Pypy:

matt$ pypy
Python 2.7.3 (daf4a1b651e0, Dec 07 2012, 23:00:16)
[PyPy 2.0.0-beta1 with GCC 4.2.1] on darwin
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``a 10th of forever is 1h45''
>>>>
>>>> from timeit import timeit
>>>> timeit("""
.... if something == 'this': pass
.... elif something == 'that': pass
.... elif something == 'there': pass
.... else: pass
.... """, "something='foo'", number=10000000)
0.03306388854980469

Most efficient way of making an if-elif-elif-else statement when the else is done the most?

Related

Recent Posts