What does "del" do exactly?

Solution 1:

Python is a garbage-collected language. If a value isn't "reachable" from your code anymore, it will eventually get deleted.

The del statement, as you saw, removes the binding of your variable. Variables aren't values; they're just names for values.
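
A minimal sketch of that distinction. The function and variable names here are made up for illustration. After `del`, the name is gone, but the list it referred to is still usable through another name:

```python
def demo_del_unbinds_name():
    data = [1, 2, 3]
    alias = data              # a second name for the same list object
    del data                  # removes the name "data", not the list itself
    try:
        data                  # the name no longer exists in this scope
        name_still_bound = True
    except NameError:         # inside a function this is UnboundLocalError, a NameError subclass
        name_still_bound = False
    return name_still_bound, alias

name_still_bound, alias = demo_del_unbinds_name()
print(name_still_bound, alias)
```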

If that variable was the only reference to the value anywhere, the value will eventually get deleted. In CPython in particular, the garbage collector is built on top of reference counting. So, that "eventually" means "immediately".* In other implementations, it's usually "pretty soon".

If there were other references to the same value, however, just removing one of those references (whether by del x, x = None, exiting the scope where x existed, etc.) doesn't clean anything up.**
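
One way to watch both cases is with a weak reference, which lets you check whether an object is still alive without keeping it alive. This is only a sketch: `Payload` and the two helper functions are hypothetical names, and the "immediately" result in the first case is CPython behavior, not something the language guarantees.

```python
import weakref

class Payload:
    """Placeholder class; plain instances support weak references."""

def last_reference_case():
    obj = Payload()
    probe = weakref.ref(obj)   # does not count as a reference
    del obj                    # that was the only reference
    return probe() is None     # True in CPython: freed right away

def other_reference_case():
    obj = Payload()
    keeper = obj               # a second reference to the same object
    probe = weakref.ref(obj)
    del obj                    # removes one reference; "keeper" remains
    return probe() is keeper   # the object is still alive

print(last_reference_case(), other_reference_case())
```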


There's another issue here. I don't know exactly what the memory_profiler module measures, but its description (which talks about using psutil) suggests it's measuring your process's memory usage from "outside".

When Python frees up storage, it doesn't always—or even usually—return it to the operating system. It keeps "free lists" around at multiple levels so it can re-use the memory more quickly than if it had to go all the way back to the OS to ask for more. On modern systems, this is rarely a problem—if you need the storage again, it's good that you had it; if you don't, it'll get paged out as soon as someone else needs it and never get paged back in, so there's little harm.

(On top of that, what I referred to as "the OS" above is really an abstraction made up of multiple levels, from the malloc library through the core C library to the kernel/pager, and at least one of those levels usually has its own free lists.)

If you want to trace memory use from the inside perspective… well, that's pretty hard. It gets a lot easier in Python 3.4 thanks to the new tracemalloc module. There are various third-party modules (e.g., heapy/guppy, Pympler, meliae) that try to get the same kind of information with earlier versions, but it's difficult, because getting information from the various allocators, and tying that information to the garbage collector, was very hard before PEP 445.
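
For the "inside" view, here is a rough sketch using tracemalloc (Python 3.4+). The function name and the sizes are arbitrary choices for illustration. It only reports what Python's allocators have handed out, so the numbers need not match what an external tool sees for the process.

```python
import tracemalloc

def traced_sizes(n_items=10_000):
    tracemalloc.start()
    baseline, _ = tracemalloc.get_traced_memory()    # returns (current, peak)
    data = [bytes(100) for _ in range(n_items)]      # many distinct 100-byte objects
    with_data, _ = tracemalloc.get_traced_memory()
    del data                                         # last reference: CPython frees it right away
    after_del, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return baseline, with_data, after_del

baseline, with_data, after_del = traced_sizes()
print(baseline, with_data, after_del)
```

The traced figure should drop back near the baseline after the `del`, even when an outside measurement of the process barely moves.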


* In some cases, there are still references to the value, but only from other values that are themselves unreachable, possibly in a cycle. That counts as "unreachable" as far as the garbage collector is concerned, but the reference counts never drop to zero. So CPython also has a "cycle detector" that runs every so often, finds cycles of values that reference each other but are not reachable from anywhere else, and cleans them up.
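
A small sketch of that case, with a hypothetical `Node` class. Automatic collection is switched off during the demo so the cycle detector only runs when asked, via `gc.collect()`:

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

def cycle_demo():
    gc_was_enabled = gc.isenabled()
    gc.disable()                          # keep the cycle detector from running on its own
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a           # a and b reference each other
        probe = weakref.ref(a)
        del a, b                          # unreachable now, but refcounts are still nonzero
        alive_after_del = probe() is not None
        gc.collect()                      # run the cycle detector explicitly
        alive_after_collect = probe() is not None
    finally:
        if gc_was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect

print(cycle_demo())
```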

** If you're testing in the interactive console, there may be hidden references to your values that are hard to track, so you might think you've gotten rid of the last reference when you haven't. In a script, it should always be possible, if not easy, to figure things out. The gc module can help, as can the debugger. But of course both of them also give you new ways to add additional hidden references.
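
As a starting point for that kind of hunt, `gc.get_referrers` lists the container objects that refer to a value. A minimal sketch, where `Payload`, `holder`, and `registry` are made-up names. Note that the list it returns is itself a new reference to everything in it, so don't keep it around.

```python
import gc

class Payload:
    """Placeholder class for the value being tracked."""

def referrer_type_names(obj):
    # Only the type names are kept, so no extra references outlive the call.
    return [type(r).__name__ for r in gc.get_referrers(obj)]

target = Payload()
holder = [target]              # one "hidden" reference, in a list
registry = {"key": target}     # another one, in a dict

names = referrer_type_names(target)
print(names)
```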