Pytest where to store expected data

Solution 1:

I had a similar problem once, where I have to test configuration file against an expected file. That's how I fixed it:

  1. Create a folder with the same name of your test module and at the same location. Put all your expected files inside that folder.

    test_foo/
        expected_config_1.ini
        expected_config_2.ini
    test_foo.py
    
  2. Create a fixture responsible for moving the contents of this folder to a temporary file. I did use of tmpdir fixture for this matter.

    from __future__ import unicode_literals
    from distutils import dir_util
    from pytest import fixture
    import os
    
    
    @fixture
    def datadir(tmpdir, request):
        '''
        Fixture responsible for searching a folder with the same name of test
        module and, if available, moving all contents to a temporary directory so
        tests can use them freely.
        '''
        filename = request.module.__file__
        test_dir, _ = os.path.splitext(filename)
    
        if os.path.isdir(test_dir):
            dir_util.copy_tree(test_dir, bytes(tmpdir))
    
        return tmpdir
    

    Important: If you are using Python 3, replace dir_util.copy_tree(test_dir, bytes(tmpdir)) with dir_util.copy_tree(test_dir, str(tmpdir)).

  3. Use your new fixture.

    def test_foo(datadir):
        expected_config_1 = datadir.join('expected_config_1.ini')
        expected_config_2 = datadir.join('expected_config_2.ini')
    

Remember: datadir is just the same as tmpdir fixture, plus the ability of working with your expected files placed into the a folder with the very name of test module.

Solution 2:

If you only have a few tests, then why not include the data as a string literal:

expected_data = """
Your data here...
"""

If you have a handful, or the expected data is really long, I think your use of fixtures makes sense.

However, if you have many, then perhaps a different solution would be better. In fact, for one project I have over one hundred input and expected-output files. So I built my own testing framework (more or less). I used Nose, but PyTest would work as well. I created a test generator which walked the directory of test files. For each input file, a test was yielded which compared the actual output with the expected output (PyTest calls it parametrizing). Then I documented my framework so others could use it. To review and/or edit the tests, you only edit the input and/or expected output files and never need to look at the python test file. To enable different input files to to have different options defined, I also crated a YAML config file for each directory (JSON would work as well to keep the dependencies down). The YAML data consists of a dictionary where each key is the name of the input file and the value is a dictionary of keywords that will get passed to the function being tested along with the input file. If you're interested, here is the source code and documentation. I recently played with the idea of defining the options as Unittests here (requires only the built-in unittest lib) but I'm not sure if I like it.

Solution 3:

I believe pytest-datafiles can be of great help. Unfortunately, it seems not to be maintained much anymore. For the time being, it's working nicely.

Here's a simple example taken from the docs:

import os
import pytest

@pytest.mark.datafiles('/opt/big_files/film1.mp4')
def test_fast_forward(datafiles):
    path = str(datafiles)  # Convert from py.path object to path (str)
    assert len(os.listdir(path)) == 1
    assert os.path.isfile(os.path.join(path, 'film1.mp4'))
    #assert some_operation(os.path.join(path, 'film1.mp4')) == expected_result

    # Using py.path syntax
    assert len(datafiles.listdir()) == 1
    assert (datafiles / 'film1.mp4').check(file=1)

Solution 4:

Think if the whole contents of the config file really needs to be tested.

If only several values or substrings must be checked, prepare an expected template for that config. The tested places will be marked as "variables" with some special syntax. Then prepare a separate expected list of the values for the variables in the template. This expected list can be stored as a separate file or directly in the source code.

Example for the template:

ALLOWED_HOSTS = ['{host}']
DEBUG = {debug}
DEFAULT_FROM_EMAIL = '{email}'

Here, the template variables are placed inside curly braces.

The expected values can look like:

host = www.example.com
debug = False
email = [email protected]

or even as a simple comma-separated list:

www.example.com, False, [email protected]

Then your testing code can produce the expected file from the template by replacing the variables with the expected values. And the expected file is compared with the actual one.

Maintaining the template and expected values separately has and advantage that you can have many testing data sets using the same template.

Testing only variables

An even better approach is that the config generation method produces only needed values for the config file. These values can be easily inserted into the template by another method. But the advantage is that the testing code can directly compare all config variables separately and in clear way.

Templates

While it is easy to replace the variables with needed values in the template, there are ready template libraries, which allow to do it only in one line. Here are just a few examples: Django, Jinja, Mako