Proper indentation for multiline strings?
What is the proper indentation for Python multiline strings within a function?
def method():
string = """line one
line two
line three"""
or
def method():
string = """line one
line two
line three"""
or something else?
It looks kind of weird to have the string hanging outside the function in the first example.
Solution 1:
You probably want to line up with the """
def foo():
string = """line one
line two
line three"""
Since the newlines and spaces are included in the string itself, you will have to postprocess it. If you don't want to do that and you have a whole lot of text, you might want to store it separately in a text file. If a text file does not work well for your application and you don't want to postprocess, I'd probably go with
def foo():
string = ("this is an "
"implicitly joined "
"string")
If you want to postprocess a multiline string to trim out the parts you don't need, you should consider the textwrap
module or the technique for postprocessing docstrings presented in PEP 257:
def trim(docstring):
if not docstring:
return ''
# Convert tabs to spaces (following the normal Python rules)
# and split into a list of lines:
lines = docstring.expandtabs().splitlines()
# Determine minimum indentation (first line doesn't count):
indent = sys.maxint
for line in lines[1:]:
stripped = line.lstrip()
if stripped:
indent = min(indent, len(line) - len(stripped))
# Remove indentation (first line is special):
trimmed = [lines[0].strip()]
if indent < sys.maxint:
for line in lines[1:]:
trimmed.append(line[indent:].rstrip())
# Strip off trailing and leading blank lines:
while trimmed and not trimmed[-1]:
trimmed.pop()
while trimmed and not trimmed[0]:
trimmed.pop(0)
# Return a single string:
return '\n'.join(trimmed)
Solution 2:
The textwrap.dedent
function allows one to start with correct indentation in the source, and then strip it from the text before use.
The trade-off, as noted by some others, is that this is an extra function call on the literal; take this into account when deciding where to place these literals in your code.
import textwrap
def frobnicate(param):
""" Frobnicate the scrognate param.
The Weebly-Ruckford algorithm is employed to frobnicate
the scrognate to within an inch of its life.
"""
prepare_the_comfy_chair(param)
log_message = textwrap.dedent("""\
Prepare to frobnicate:
Here it comes...
Any moment now.
And: Frobnicate!""")
weebly(param, log_message)
ruckford(param)
The trailing \
in the log message literal is to ensure that line break isn't in the literal; that way, the literal doesn't start with a blank line, and instead starts with the next full line.
The return value from textwrap.dedent
is the input string with all common leading whitespace indentation removed on each line of the string. So the above log_message
value will be:
Prepare to frobnicate:
Here it comes...
Any moment now.
And: Frobnicate!
Solution 3:
Use inspect.cleandoc
like so:
def method():
string = inspect.cleandoc("""
line one
line two
line three""")
Relative indentation will be maintained as expected. As commented below, if you want to keep preceding empty lines, use textwrap.dedent
. However that also keeps the first line break.
Note: It's good practice to indent logical blocks of code under its related context to clarify the structure. E.g. the multi-line string belonging to the variable
string
.
Solution 4:
One option which seems to missing from the other answers (only mentioned deep down in a comment by naxa) is the following:
def foo():
string = ("line one\n" # Add \n in the string
"line two" "\n" # Add "\n" after the string
"line three\n")
This will allow proper aligning, join the lines implicitly, and still keep the line shift which, for me, is one of the reasons why I would like to use multiline strings anyway.
It doesn't require any postprocessing, but you need to manually add the \n
at any given place that you want the line to end. Either inline or as a separate string after. The latter is easier to copy-paste in.
Solution 5:
Some more options. In Ipython with pylab enabled, dedent is already in the namespace. I checked and it is from matplotlib. Or it can be imported with:
from matplotlib.cbook import dedent
In documentation it states that it is faster than the textwrap equivalent one and in my tests in ipython it is indeed 3 times faster on average with my quick tests. It also has the benefit that it discards any leading blank lines this allows you to be flexible in how you construct the string:
"""
line 1 of string
line 2 of string
"""
"""\
line 1 of string
line 2 of string
"""
"""line 1 of string
line 2 of string
"""
Using the matplotlib dedent on these three examples will give the same sensible result. The textwrap dedent function will have a leading blank line with 1st example.
Obvious disadvantage is that textwrap is in standard library while matplotlib is external module.
Some tradeoffs here... the dedent functions make your code more readable where the strings get defined, but require processing later to get the string in usable format. In docstrings it is obvious that you should use correct indentation as most uses of the docstring will do the required processing.
When I need a non long string in my code I find the following admittedly ugly code where I let the long string drop out of the enclosing indentation. Definitely fails on "Beautiful is better than ugly.", but one could argue that it is simpler and more explicit than the dedent alternative.
def example():
long_string = '''\
Lorem ipsum dolor sit amet, consectetur adipisicing
elit, sed do eiusmod tempor incididunt ut labore et
dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip.\
'''
return long_string
print example()