str performance in Python
While profiling a piece of Python code (Python 2.6 up to 3.2), I discovered that using str to convert an object (in my case an integer) to a string is almost an order of magnitude slower than using string formatting.
Here is the benchmark:
>>> from timeit import Timer
>>> Timer('str(100000)').timeit()
0.3145311339386332
>>> Timer('"%s"%100000').timeit()
0.03803517023435887
Does anyone know why this is the case? Am I missing something?
Solution 1:
'%s' % 100000 is evaluated by the compiler and is equivalent to a constant at run-time.
>>> import dis
>>> dis.dis(lambda: str(100000))
  8           0 LOAD_GLOBAL              0 (str)
              3 LOAD_CONST               1 (100000)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE
>>> dis.dis(lambda: '%s' % 100000)
  9           0 LOAD_CONST               3 ('100000')
              3 RETURN_VALUE
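A quick way to check this on your own interpreter, without reading the disassembly by eye, is to look for the finished string among the constants of the compiled expression. This is only a minimal sketch: is_folded is a hypothetical helper name, co_consts is a CPython implementation detail, and whether folding happens depends on the interpreter and version, so the script reports what it finds instead of assuming an outcome.

```python
def is_folded(expr, expected):
    """Return True if compiling `expr` already yields `expected` as a constant.

    Hypothetical helper for illustration only: it inspects co_consts of the
    compiled expression, which is a CPython implementation detail.
    """
    code = compile(expr, "<expr>", "eval")
    # Compare the type as well as the value so that the integer 100000
    # never matches the string '100000'.
    return any(type(c) is type(expected) and c == expected
               for c in code.co_consts)


for expr in ("'%s' % 100000", "str(100000)"):
    status = "folded" if is_folded(expr, "100000") else "evaluated at run time"
    print(expr, "->", status)
```

If the first line prints "evaluated at run time", your interpreter does not fold this expression, and the tenfold gap from the question should not show up there.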
% with a run-time expression is not (significantly) faster than str:
>>> Timer('str(x)', 'x=100').timeit()
0.25641703605651855
>>> Timer('"%s" % x', 'x=100').timeit()
0.2169809341430664
Do note that str is still slightly slower. As @DietrichEpp said, this is because str involves lookup and function call operations, while % compiles to a single immediate bytecode:
>>> dis.dis(lambda x: str(x))
  9           0 LOAD_GLOBAL              0 (str)
              3 LOAD_FAST                0 (x)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE
>>> dis.dis(lambda x: '%s' % x)
 10           0 LOAD_CONST               1 ('%s')
              3 LOAD_FAST                0 (x)
              6 BINARY_MODULO
              7 RETURN_VALUE
Of course the above is true for the system I tested on (CPython 2.7); other implementations may differ.
Solution 2:
One reason that comes to mind is that str(100000) involves a global lookup, but "%s" % 100000 does not: the name str has to be looked up in the global scope. This does not account for the entire difference, though:
>>> Timer('str(100000)').timeit()
0.2941889762878418
>>> Timer('x(100000)', 'x=str').timeit()
0.24904918670654297
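The same idea applies inside a function: binding str to a local name means the loop body no longer looks it up by name on each call. The sketch below illustrates this under my own assumptions. The function names and the _str default argument are hypothetical, and the gain may be small or absent on newer interpreters, so measure before relying on it.

```python
from timeit import timeit


def convert_global(values):
    out = []
    for v in values:
        out.append(str(v))  # `str` is looked up by name on every iteration
    return out


def convert_local(values, _str=str):
    out = []
    for v in values:
        out.append(_str(v))  # `_str` is an ordinary local variable
    return out


values = list(range(1000))
# Both variants must produce identical results; only the lookup differs.
assert convert_global(values) == convert_local(values)

for fn in (convert_global, convert_local):
    print(fn.__name__, timeit(lambda: fn(values), number=200))
```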
As noted by thg435, most of the gap disappears once the formatted value is a variable rather than a literal:
>>> Timer('"%s"%100000').timeit()
0.034214019775390625
>>> Timer('"%s"%x','x=100000').timeit()
0.2940788269042969
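To rerun all four measurements side by side on your own machine, something like the following could be used. It is a rough sketch: best_time and CASES are hypothetical names, and it runs fewer iterations than the default Timer.timeit() call above, so the absolute numbers are not comparable to the ones quoted in this thread. Only the ratios between the cases are of interest.

```python
import timeit

# (statement, setup) pairs: literal vs. variable, str() vs. % formatting.
CASES = [
    ("str(100000)", "pass"),
    ("'%s' % 100000", "pass"),
    ("str(x)", "x = 100000"),
    ("'%s' % x", "x = 100000"),
]


def best_time(stmt, setup="pass", number=100000, rounds=5):
    """Return the fastest of `rounds` runs, each executing `stmt` `number` times."""
    return min(timeit.repeat(stmt, setup, repeat=rounds, number=number))


for stmt, setup in CASES:
    print("%-16s %.4f s" % (stmt, best_time(stmt, setup)))
```

Taking the minimum of several rounds reduces noise from other processes, which matters when the differences being compared are this small.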