Python string class like StringBuilder in C#?
Is there some string class in Python like StringBuilder
in C#?
There is no one-to-one correlation. For a really good article please see Efficient String Concatenation in Python:
Building long strings in the Python progamming language can sometimes result in very slow running code. In this article I investigate the computational performance of various string concatenation methods.
TLDR the fastest method is below. It's extremely compact, and also pretty understandable:
def method6():
return ''.join([`num` for num in xrange(loop_count)])
Relying on compiler optimizations is fragile. The benchmarks linked in the accepted answer and numbers given by Antoine-tran are not to be trusted. Andrew Hare makes the mistake of including a call to repr
in his methods. That slows all the methods equally but obscures the real penalty in constructing the string.
Use join
. It's very fast and more robust.
$ ipython3
Python 3.5.1 (default, Mar 2 2016, 03:38:02)
IPython 4.1.2 -- An enhanced Interactive Python.
In [1]: values = [str(num) for num in range(int(1e3))]
In [2]: %%timeit
...: ''.join(values)
...:
100000 loops, best of 3: 7.37 µs per loop
In [3]: %%timeit
...: result = ''
...: for value in values:
...: result += value
...:
10000 loops, best of 3: 82.8 µs per loop
In [4]: import io
In [5]: %%timeit
...: writer = io.StringIO()
...: for value in values:
...: writer.write(value)
...: writer.getvalue()
...:
10000 loops, best of 3: 81.8 µs per loop
I have used the code of Oliver Crow (link given by Andrew Hare) and adapted it a bit to tailor Python 2.7.3. (by using timeit package). I ran on my personal computer, Lenovo T61, 6GB RAM, Debian GNU/Linux 6.0.6 (squeeze).
Here is the result for 10,000 iterations:
method1: 0.0538418292999 secs process size 4800 kb method2: 0.22602891922 secs process size 4960 kb method3: 0.0605459213257 secs process size 4980 kb method4: 0.0544030666351 secs process size 5536 kb method5: 0.0551080703735 secs process size 5272 kb method6: 0.0542731285095 secs process size 5512 kb
and for 5,000,000 iterations (method 2 was ignored because it ran tooo slowly, like forever):
method1: 5.88603997231 secs process size 37976 kb method3: 8.40748500824 secs process size 38024 kb method4: 7.96380496025 secs process size 321968 kb method5: 8.03666186333 secs process size 71720 kb method6: 6.68192911148 secs process size 38240 kb
It is quite obvious that Python guys have done pretty great job to optimize string concatenation, and as Hoare said: "premature optimization is the root of all evil" :-)
Python has several things that fulfill similar purposes:
- One common way to build large strings from pieces is to grow a list of strings and join it when you are done. This is a frequently-used Python idiom.
- To build strings incorporating data with formatting, you would do the formatting separately.
- For insertion and deletion at a character level, you would keep a list of length-one strings. (To make this from a string, you'd call
list(your_string)
. You could also use aUserString.MutableString
for this. -
(c)StringIO.StringIO
is useful for things that would otherwise take a file, but less so for general string building.
Using method 5 from above (The Pseudo File) we can get very good perf and flexibility
from cStringIO import StringIO
class StringBuilder:
_file_str = None
def __init__(self):
self._file_str = StringIO()
def Append(self, str):
self._file_str.write(str)
def __str__(self):
return self._file_str.getvalue()
now using it
sb = StringBuilder()
sb.Append("Hello\n")
sb.Append("World")
print sb