Advanced string formatting vs template strings
Templates are meant to be simpler than the usual string formatting, at the cost of expressiveness. The rationale of PEP 292 compares templates to Python's %-style string formatting:
Python currently supports a string substitution syntax based on C's printf() '%' formatting character. While quite rich, %-formatting codes are also error prone, even for experienced Python programmers. A common mistake is to leave off the trailing format character, e.g. the s in %(name)s.
In addition, the rules for what can follow a % sign are fairly complex, while the usual application rarely needs such complexity. Most scripts need to do some string interpolation, but most of those use simple "stringification" formats, i.e. %s or %(name)s This form should be made simpler and less error prone.
While the new .format() improved the situation, it's still true that the format string syntax is rather complex, so the rationale still has its points.
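To make the mistake described in the PEP concrete, here is a minimal sketch. The variable names (data, ok, error, tpl) are made up for illustration; the point is only that dropping the trailing s from %(name)s is rejected at runtime, while the Template form has no conversion character to forget.

```python
from string import Template

data = {"name": "Joe"}

# Correct %-formatting: the trailing "s" is the conversion type.
ok = "Hello, %(name)s!" % data

# Leaving off the trailing "s" makes the format string invalid,
# and Python raises ValueError when the string is formatted.
try:
    "Hello, %(name)!" % data
    error = None
except ValueError as exc:
    error = exc

# The Template equivalent has no conversion type to leave off.
tpl = Template("Hello, $name!").substitute(data)

print(ok)
print(type(error).__name__)
print(tpl)
```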
One key advantage of string templates is that you can substitute only some of the placeholders using the safe_substitute method. Normal format strings will raise an error if a placeholder is not passed a value. For example:
"Hello, {first} {last}".format(first='Joe')
raises:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'last'
But:
from string import Template
Template("Hello, $first $last").safe_substitute(first='Joe')
Produces:
'Hello, Joe $last'
Note that the returned value is a string, not a Template; if you want to substitute the $last you'll need to create a new Template object from that string.
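As a rough sketch of that two-step substitution (the names partial and full, and the value "Bloggs", are just illustrative):

```python
from string import Template

# First pass: only $first is known, so $last is left untouched.
partial = Template("Hello, $first $last").safe_substitute(first="Joe")
# partial is a plain str at this point, not a Template.

# Second pass: wrap the intermediate string in a new Template
# and fill in the remaining placeholder.
full = Template(partial).substitute(last="Bloggs")

print(partial)
print(full)
```

One thing to keep in mind with this approach: if a value substituted in the first pass itself contains a $, the second pass will try to interpret it as a placeholder.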
For what it's worth, Template substitution from a dict appears to be 4 to 10 times slower than format substitution, depending on the length of the template. Here's a quick comparison I ran under OS X on a 2.3 GHz Core i7 with Python 3.5.
from string import Template
lorem = "Lorem ipsum dolor sit amet {GIBBERISH}, consectetur adipiscing elit {DRIVEL}. Expectoque quid ad id, quod quaerebam, respondeas."
loremtpl = Template("Lorem ipsum dolor sit amet $GIBBERISH, consectetur adipiscing elit $DRIVEL. Expectoque quid ad id, quod quaerebam, respondeas.")
d = dict(GIBBERISH='FOOBAR', DRIVEL='RAXOOP')
In [29]: timeit lorem.format(**d)
1.07 µs ± 2.13 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [30]: timeit loremtpl.substitute(d)
8.74 µs ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The worst case I tested was about 10 times slower for a 13-character string. The best case I tested was about 4 times slower for a 71,000-character string.
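If you want to repeat this kind of comparison outside IPython, something like the following should work with the standard timeit module. This is only a sketch: the text is shortened, the loop count n is arbitrary, and the numbers will differ by machine and Python version.

```python
import timeit
from string import Template

text = "Lorem ipsum dolor sit amet {GIBBERISH}, consectetur adipiscing elit {DRIVEL}."
tpl = Template("Lorem ipsum dolor sit amet $GIBBERISH, consectetur adipiscing elit $DRIVEL.")
d = dict(GIBBERISH='FOOBAR', DRIVEL='RAXOOP')

# Both approaches should produce the same output, otherwise
# the comparison is not measuring equivalent work.
assert text.format(**d) == tpl.substitute(d)

n = 20000  # arbitrary loop count, chosen to keep the run short
t_format = timeit.timeit(lambda: text.format(**d), number=n)
t_template = timeit.timeit(lambda: tpl.substitute(d), number=n)

# Report the average time per call in microseconds.
print("str.format:          {:.2f} us per call".format(t_format / n * 1e6))
print("Template.substitute: {:.2f} us per call".format(t_template / n * 1e6))
```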
It's primarily a matter of syntax preference, which usually boils down to a laziness/verbosity tradeoff and familiarity with existing string template systems. In this case template strings are simpler and quicker to write, while .format() is more verbose and feature-rich.
Programmers used to the PHP language or the Jinja family of template systems may prefer template strings. Using "%s" positional style tuple substitution might appeal to those who use printf-like string formatting or want something quick. .format() has a few more features, but unless you need something specific that only .format() provides, there is nothing wrong with using any existing scheme.
The only thing to be aware of is that named string templates are more flexible and require less maintenance than order-dependent ones. Other than that, it all comes down to either personal preference or the coding standard of the project you are working on.
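To illustrate the point about named versus order-dependent placeholders, here is a small sketch (the person dict and the sentences are invented for the example). With positional substitution the argument order has to track the wording; with named placeholders the same mapping keeps working when the sentence is rearranged, in all three schemes.

```python
from string import Template

person = {"name": "Joe", "age": 30}

# Positional: the tuple order is tied to the placeholder order,
# so rewording the sentence means reordering the arguments too.
positional = "%s is %d years old" % (person["name"], person["age"])

# Named: the placeholders can appear in any order and the
# same mapping is passed unchanged.
named_percent = "At %(age)d, %(name)s is still learning" % person
named_format = "At {age}, {name} is still learning".format(**person)
named_template = Template("At $age, $name is still learning").substitute(person)

print(positional)
print(named_percent)
print(named_format)
print(named_template)
```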