Add custom conversion types for string formatting

Is there any way in Python to add additional conversion types to string formatting?

The standard conversion types used in %-based string formatting are things like s for strings, d for decimal integers, etc. What I'd like to do is add a new character for which I can specify a custom handler (for instance, a lambda function) that returns the string to insert.

For instance, I'd like to add h as a conversion type to specify that the string should be escaped for use in HTML. As an example:

#!/usr/bin/python

print "<title>%(TITLE)h</title>" % {"TITLE": "Proof that 12 < 6"}

This would apply cgi.escape to the "TITLE" value and produce the following output:

<title>Proof that 12 &lt; 6</title>

Solution 1:

You can create a custom formatter for HTML templates by subclassing string.Formatter:

import string, cgi

class Template(string.Formatter):
    def format_field(self, value, spec):
        # A spec ending in 'h' means: HTML-escape the value, then format it as a string.
        if spec.endswith('h'):
            value = cgi.escape(value)
            spec = spec[:-1] + 's'
        return super(Template, self).format_field(value, spec)

# Prints: &lt;hello&gt; 123
print Template().format('{0:h} {1:d}', "<hello>", 123)

Note that all conversion takes place inside the Template class, so the input data does not need to be changed or wrapped.
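
For anyone on Python 3, here is a rough sketch of the same idea. Two assumptions to flag: cgi.escape was removed in Python 3.8, so this swaps in html.escape (with quote=False to keep cgi.escape's default of leaving quotes alone), and the class name HTMLTemplate is made up for illustration. It uses a named field to mirror the %(TITLE)h example from the question:

```python
import html
import string


class HTMLTemplate(string.Formatter):
    """Treat a trailing 'h' in the format spec as: HTML-escape, then format as a string."""

    def format_field(self, value, spec):
        if spec.endswith('h'):
            # Assumption: html.escape stands in for the removed cgi.escape.
            value = html.escape(str(value), quote=False)
            spec = spec[:-1] + 's'
        return super().format_field(value, spec)


result = HTMLTemplate().format("<title>{TITLE:h}</title>", TITLE="Proof that 12 < 6")
print(result)  # <title>Proof that 12 &lt; 6</title>
```

Anything before the trailing h (width, alignment and so on) is passed through to the normal s formatting, and it is applied to the already escaped text.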

Solution 2:

No, not with % formatting; its set of conversion types is fixed and cannot be extended.

You can specify different formatting options when using the newer format string syntax defined for str.format() and format(). Custom types can implement a __format__() method, and that will be called with the format specification used in the template string:

import cgi

class HTMLEscapedString(unicode):
    def __format__(self, spec):
        value = unicode(self)
        if spec.endswith('h'):
            value = cgi.escape(value)
            spec = spec[:-1] + 's'
        return format(value, spec)

This does require that you use a custom type for your strings:

>>> title = HTMLEscapedString(u'Proof that 12 < 6')
>>> print "<title>{:h}</title>".format(title)
<title>Proof that 12 &lt; 6</title>
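
A Python 3 variant might look like the sketch below. It assumes html.escape as the replacement for cgi.escape (which no longer exists on Python 3.8+), subclasses str instead of unicode, and uses the made-up name HTMLEscapedStr. It also shows that a field without the h spec is left alone, and that f-strings go through the same __format__ hook:

```python
import html


class HTMLEscapedStr(str):
    """str subclass whose __format__ understands a trailing 'h' spec."""

    def __format__(self, spec):
        value = str(self)
        if spec.endswith('h'):
            # Assumption: html.escape replaces cgi.escape; quote=False mimics its default.
            value = html.escape(value, quote=False)
            spec = spec[:-1] + 's'
        return format(value, spec)


title = HTMLEscapedStr("Proof that 12 < 6")
escaped = "<title>{:h}</title>".format(title)
plain = "{}".format(title)               # no 'h' spec: value passes through unescaped
fstring = f"<title>{title:h}</title>"    # f-strings call __format__ as well
print(escaped)  # <title>Proof that 12 &lt; 6</title>
```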

For most cases, it is easier just to format the string before handing it to the template, or use a dedicated HTML templating library such as Chameleon, Mako or Jinja2; these handle HTML escaping for you.
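
The "escape first, then format" route needs no custom classes at all. A minimal sketch, assuming Python 3 and html.escape (the data dict is just the example from the question):

```python
import html

data = {"TITLE": "Proof that 12 < 6"}

# Escape every value up front, then use ordinary %-formatting with the 's' conversion.
escaped = {key: html.escape(str(value)) for key, value in data.items()}
page = "<title>%(TITLE)s</title>" % escaped
print(page)  # <title>Proof that 12 &lt; 6</title>
```

The obvious downside is that every value gets escaped, whether or not a given placeholder needs it, which is part of why a real templating library tends to be the better fit once templates grow.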

Solution 3:

I'm a bit late to the party, but here's what I do, based on an idea in https://mail.python.org/pipermail/python-ideas/2011-March/009426.html

>>> import string
>>> from xml.sax.saxutils import quoteattr
>>> class MyFormatter(string.Formatter):
...     def convert_field(self, value, conversion, _entities={'"': '&quot;'}):
...         if 'Q' == conversion:
...             return quoteattr(value, _entities)
...         else:
...             return super(MyFormatter, self).convert_field(value, conversion)
...
>>> fmt = MyFormatter().format
>>> fmt('{0!Q}', '<hello> "world"')
'"&lt;hello&gt; &quot;world&quot;"'
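
The same convert_field hook could cover the h case from the question. This is only a sketch, assuming Python 3; the !h conversion and the ConversionFormatter name are invented here, and html.escape is used since cgi.escape is gone in recent versions. Because conversion runs before the format spec is applied, a spec such as a width can still follow it:

```python
import html
import string


class ConversionFormatter(string.Formatter):
    def convert_field(self, value, conversion):
        # Hypothetical 'h' conversion: HTML-escape the value's string form.
        if conversion == 'h':
            return html.escape(str(value))
        # Fall back to the built-in conversions (!s, !r, !a).
        return super().convert_field(value, conversion)


fmt = ConversionFormatter().format
escaped = fmt('{0!h}', '<hello>')
padded = fmt('[{0!h:>14}]', '<b>')   # the width applies to the escaped text
print(escaped)  # &lt;hello&gt;
```

Note this only works through the custom formatter; plain str.format rejects conversions other than the built-in ones.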