I have a long-running script which, if let to run long enough, will consume all the memory on my system.

Without going into details about the script, I have two questions:

  1. Are there any "Best Practices" to follow, which will help prevent leaks from occurring?
  2. What techniques are there to debug memory leaks in Python?

Have a look at this article: Tracing python memory leaks

Also, note that the garbage collection module actually can have debug flags set. Look at the set_debug function. Additionally, look at this code by Gnibbler for determining the types of objects that have been created after a call.


I tried out most options mentioned previously but found this small and intuitive package to be the best: pympler

It's quite straight forward to trace objects that were not garbage-collected, check this small example:

install package via pip install pympler

from pympler.tracker import SummaryTracker
tracker = SummaryTracker()

# ... some code you want to investigate ...

tracker.print_diff()

The output shows you all the objects that have been added, plus the memory they consumed.

Sample output:

                                 types |   # objects |   total size
====================================== | =========== | ============
                                  list |        1095 |    160.78 KB
                                   str |        1093 |     66.33 KB
                                   int |         120 |      2.81 KB
                                  dict |           3 |       840 B
      frame (codename: create_summary) |           1 |       560 B
          frame (codename: print_diff) |           1 |       480 B

This package provides a number of more features. Check pympler's documentation, in particular the section Identifying memory leaks.


Let me recommend mem_top tool I created

It helped me to solve a similar issue

It just instantly shows top suspects for memory leaks in a Python program


Tracemalloc module was integrated as a built-in module starting from Python 3.4, and appearently, it's also available for prior versions of Python as a third-party library (haven't tested it though).

This module is able to output the precise files and lines that allocated the most memory. IMHO, this information is infinitly more valuable than the number of allocated instances for each type (which ends up being a lot of tuples 99% of the time, which is a clue, but barely helps in most cases).

I recommend you use tracemalloc in combination with pyrasite. 9 times out of 10, running the top 10 snippet in a pyrasite-shell will give you enough information and hints to to fix the leak within 10 minutes. Yet, if you're still unable to find the leak cause, pyrasite-shell in combination with the other tools mentioned in this thread will probably give you some more hints too. You should also take a look on all the extra helpers provided by pyrasite (such as the memory viewer).