How to do unit testing of functions writing files using Python's 'unittest'
I have a Python function that writes an output file to disk.
I want to write a unit test for it using Python's unittest
module.
How should I assert equality of files? I would like to get an error if the file content differs from the expected one + list of differences. As in the output of the Unix diff command.
Is there an official or recommended way of doing that?
I prefer to have output functions explicitly accept a file handle (or file-like object), rather than accept a file name and opening the file themselves. This way, I can pass a StringIO
object to the output function in my unit test, then .read()
the contents back from that StringIO
object (after a .seek(0)
call) and compare with my expected output.
For example, we would transition code like this
##File:lamb.py
import sys
def write_lamb(outfile_path):
with open(outfile_path, 'w') as outfile:
outfile.write("Mary had a little lamb.\n")
if __name__ == '__main__':
write_lamb(sys.argv[1])
##File test_lamb.py
import unittest
import tempfile
import lamb
class LambTests(unittest.TestCase):
def test_lamb_output(self):
outfile_path = tempfile.mkstemp()[1]
try:
lamb.write_lamb(outfile_path)
contents = open(tempfile_path).read()
finally:
# NOTE: To retain the tempfile if the test fails, remove
# the try-finally clauses
os.remove(outfile_path)
self.assertEqual(contents, "Mary had a little lamb.\n")
to code like this
##File:lamb.py
import sys
def write_lamb(outfile):
outfile.write("Mary had a little lamb.\n")
if __name__ == '__main__':
with open(sys.argv[1], 'w') as outfile:
write_lamb(outfile)
##File test_lamb.py
import unittest
from io import StringIO
import lamb
class LambTests(unittest.TestCase):
def test_lamb_output(self):
outfile = StringIO()
# NOTE: Alternatively, for Python 2.6+, you can use
# tempfile.SpooledTemporaryFile, e.g.,
#outfile = tempfile.SpooledTemporaryFile(10 ** 9)
lamb.write_lamb(outfile)
outfile.seek(0)
content = outfile.read()
self.assertEqual(content, "Mary had a little lamb.\n")
This approach has the added benefit of making your output function more flexible if, for instance, you decide you don't want to write to a file, but some other buffer, since it will accept all file-like objects.
Note that using StringIO
assumes the contents of the test output can fit into main memory. For very large output, you can use a temporary file approach (e.g., tempfile.SpooledTemporaryFile).
The simplest thing is to write the output file, then read its contents, read the contents of the gold (expected) file, and compare them with simple string equality. If they are the same, delete the output file. If they are different, raise an assertion.
This way, when the tests are done, every failed test will be represented with an output file, and you can use a third-party tool to diff them against the gold files (Beyond Compare is wonderful for this).
If you really want to provide your own diff output, remember that the Python stdlib has the difflib module. The new unittest support in Python 3.1 includes an assertMultiLineEqual
method that uses it to show diffs, similar to this:
def assertMultiLineEqual(self, first, second, msg=None):
"""Assert that two multi-line strings are equal.
If they aren't, show a nice diff.
"""
self.assertTrue(isinstance(first, str),
'First argument is not a string')
self.assertTrue(isinstance(second, str),
'Second argument is not a string')
if first != second:
message = ''.join(difflib.ndiff(first.splitlines(True),
second.splitlines(True)))
if msg:
message += " : " + msg
self.fail("Multi-line strings are unequal:\n" + message)
import filecmp
Then
self.assertTrue(filecmp.cmp(path1, path2))
I always try to avoid writing files to disk, even if it's a temporary folder dedicated to my tests: not actually touching the disk makes your tests much faster, especially if you interact with files a lot in your code.
Suppose you have this "amazing" piece of software in a file called main.py
:
"""
main.py
"""
def write_to_file(text):
with open("output.txt", "w") as h:
h.write(text)
if __name__ == "__main__":
write_to_file("Every great dream begins with a dreamer.")
To test the write_to_file
method, you can write something like this in a file in the same folder called test_main.py
:
"""
test_main.py
"""
from unittest.mock import patch, mock_open
import main
def test_do_stuff_with_file():
open_mock = mock_open()
with patch("main.open", open_mock, create=True):
main.write_to_file("test-data")
open_mock.assert_called_with("output.txt", "w")
open_mock.return_value.write.assert_called_once_with("test-data")
You could separate the content generation from the file handling. That way, you can test that the content is correct without having to mess around with temporary files and cleaning them up afterward.
If you write a generator method that yields each line of content, then you can have a file handling method that opens a file and calls file.writelines()
with the sequence of lines. The two methods could even be on the same class: test code would call the generator, and production code would call the file handler.
Here's an example that shows all three ways to test. Usually, you would just pick one, depending on what methods are available on the class to test.
import os
from io import StringIO
from unittest.case import TestCase
class Foo(object):
def save_content(self, filename):
with open(filename, 'w') as f:
self.write_content(f)
def write_content(self, f):
f.writelines(self.generate_content())
def generate_content(self):
for i in range(3):
yield u"line {}\n".format(i)
class FooTest(TestCase):
def test_generate(self):
expected_lines = ['line 0\n', 'line 1\n', 'line 2\n']
foo = Foo()
lines = list(foo.generate_content())
self.assertEqual(expected_lines, lines)
def test_write(self):
expected_text = u"""\
line 0
line 1
line 2
"""
f = StringIO()
foo = Foo()
foo.write_content(f)
self.assertEqual(expected_text, f.getvalue())
def test_save(self):
expected_text = u"""\
line 0
line 1
line 2
"""
foo = Foo()
filename = 'foo_test.txt'
try:
foo.save_content(filename)
with open(filename, 'rU') as f:
text = f.read()
finally:
os.remove(filename)
self.assertEqual(expected_text, text)