TensorFlow: memory leak even when closing the Session?

I was just trying some stuff for a quaternionic neural network when I realized that, even if I close my current Session in a for loop, my program slows down massively and I get a memory leak caused by ops being constructed. This is my code:

for step in xrange(0, 200):  # num_epochs * train_size // BATCH_SIZE):

    with tf.Session() as sess:

        offset = (BATCH_SIZE) % train_size
        # print "Offset : %d" % offset

        batch_data = []
        batch_labels = []
        batch_data.append(qtrain[0][offset:(offset + BATCH_SIZE)])
        batch_labels.append(qtrain_labels[0][offset:(offset + BATCH_SIZE)])
        retour = sess.run(test, feed_dict={x: batch_data})

        # New ops are created here on every iteration.
        test2 = feedForwardStep(retour, W_to_output, b_output)
        # sess.close()

The problem seems to come from test2 = feedForwardStep(...). I need to declare these ops after executing retour once, because retour can't be a placeholder (I need to iterate through it). Without this line, the program runs fine: it is fast and has no memory leak. I can't understand why TensorFlow seems to "save" test2 even though I close the session.


Solution 1:

TL;DR: Closing a session does not free the tf.Graph data structure in your Python program, and if each iteration of the loop adds nodes to the graph, you'll have a leak.

Your function feedForwardStep creates new TensorFlow operations, and you call it inside the for loop, so there is a leak in your code, albeit a subtle one.
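
One way to see the growth is to count the operations in the graph after each iteration. The following is only a minimal sketch: feed_forward_step is a hypothetical stand-in for feedForwardStep, the constants are made up, and it assumes a TensorFlow release that ships the tensorflow.compat.v1 module (1.14 or later, including 2.x). It uses an explicit graph to keep the demonstration isolated; the global default graph behaves the same way.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()


def feed_forward_step(inputs, weights, bias):
    """Hypothetical stand-in for feedForwardStep: every call creates new ops."""
    return tf.matmul(inputs, weights) + bias


graph = tf.Graph()
op_counts = []
with graph.as_default():
    weights = tf.constant([[1.0], [2.0]])
    bias = tf.constant([0.5])
    for step in range(3):
        with tf.Session() as sess:
            # The session is closed at the end of each iteration, but the ops
            # created by this call are added to `graph` and remain there.
            out = feed_forward_step(tf.constant([[1.0, 1.0]]), weights, bias)
            sess.run(out)
        op_counts.append(len(graph.get_operations()))

# The counts keep increasing even though every session has been closed.
print(op_counts)
```

If the printed counts increase from one iteration to the next, the graph is growing, regardless of how many sessions have been closed.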

Unless you specify otherwise (using a with tf.Graph().as_default(): block), all TensorFlow operations are added to a global default graph. This means that every call to tf.constant(), tf.matmul(), tf.Variable() etc. adds objects to a global data structure. There are two ways to avoid this:

  1. Structure your program so that you build the graph once, then use tf.placeholder() ops to feed in different values in each iteration. You mention in your question that this might not be possible.

  2. Explicitly create a new graph in each iteration of the for loop. This might be necessary if the structure of the graph depends on the data available in the current iteration. You would do this as follows:

    for step in xrange(200):
        with tf.Graph().as_default(), tf.Session() as sess:
            # Remainder of loop body goes here.
    

    Note that in this version, you cannot use Tensor or Operation objects from a previous iteration. (For example, it's not clear from your code snippet where test comes from.)
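
For the first option, a possible arrangement is sketched below. It assumes that whatever Python-side processing retour needs can happen between two run calls, which may not hold for the code in the question. The tensor names, shapes and constants are hypothetical, and the sketch again assumes the tensorflow.compat.v1 module is available. Calling finalize() on the graph is optional, but it turns any accidental op creation inside the loop into an immediate error instead of a silent leak.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()

graph = tf.Graph()
with graph.as_default():
    # Build every op exactly once, before the loop.
    x = tf.placeholder(tf.float32, shape=[None, 2])
    weights = tf.constant([[1.0], [2.0]])
    hidden = tf.matmul(x, weights)  # plays the role of `test`
    hidden_in = tf.placeholder(tf.float32, shape=[None, 1])
    output = hidden_in + 0.5  # plays the role of `test2`
    graph.finalize()  # later attempts to add ops raise RuntimeError

ops_before = len(graph.get_operations())
results = []
with tf.Session(graph=graph) as sess:
    for step in range(3):
        batch = np.full((1, 2), float(step), dtype=np.float32)
        retour = sess.run(hidden, feed_dict={x: batch})
        # `retour` is a NumPy array here, so it can be iterated over or
        # modified in plain Python before being fed back into the graph.
        results.append(sess.run(output, feed_dict={hidden_in: retour}))
ops_after = len(graph.get_operations())

# No ops were added inside the loop.
print(ops_before == ops_after)
```

Only the values fed through feed_dict change between iterations, so the graph stays the same size however many steps are run.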