Python - Working around memory leaks

You can use something like this to help track down memory leaks

>>> from collections import defaultdict
>>> from gc import get_objects
>>> before = defaultdict(int)
>>> after = defaultdict(int)
>>> for i in get_objects():
...     before[type(i)] += 1 
...

now suppose the tests leaks some memory

>>> leaked_things = [[x] for x in range(10)]
>>> for i in get_objects():
...     after[type(i)] += 1
... 
>>> print [(k, after[k] - before[k]) for k in after if after[k] - before[k]]
[(<type 'list'>, 11)]

11 because we have leaked one list containing 10 more lists

Threads would not help. If you must give up on finding the leak, then the only solution to contain its effect is running a new process once in a while (e.g., when a test has left overall memory consumption too high for your liking -- you can determine VM size easily by reading /proc/self/status in Linux, and other similar approaches on other OS's).

Make sure the overall script takes an optional parameter to tell it what test number (or other test identification) to start from, so that when one instance of the script decides it's taking up too much memory, it can tell its successor where to restart from.

Or, more solidly, make sure that as each test is completed its identification is appended to some file with a well-known name. When the program starts it begins by reading that file and thus knows what tests have already been run. This architecture is more solid because it also covers the case where the program crashes during a test; of course, to fully automate recovery from such crashes, you'll want a separate watchdog program and process to be in charge of starting a fresh instance of the test program when it determines the previous one has crashed (it could use subprocess for the purpose -- it also needs a way to tell when the sequence is finished, e.g. a normal exit from the test program could mean that while any crash or exit with a status != 0 signify the need to start a new fresh instance).

If these architectures appeal but you need further help implementing them, just comment to this answer and I'll be happy to supply example code -- I don't want to do it "preemptively" in case there are as-yet-unexpressed issues that make the architectures unsuitable for you. (It might also help to know what platforms you need to run on).

Python - Working around memory leaks

Related

Recent Posts