TensorFlow: Is it normal that my GPU is using all its memory but is not under full load?

I am currently trying to run a text-based sequence-to-sequence model using TensorFlow 2.6 and cuDNN.

The code is running, but taking suspiciously long. When I check my Task Manager, I see the following:

Task Manager Screenshot

This looks weird to me, because all of the memory is taken but the GPU is not under heavy load. Is this expected behaviour?

System:

  • Windows 10
  • Python 3.9.9
  • TensorFlow & Keras 2.6
  • CUDA 11.6
  • cuDNN 8.3
  • NVIDIA RTX 3080 Ti

In the code, I found the following settings for the GPU:

def get_gpu_config():
  gconfig = tf.compat.v1.ConfigProto()
  gconfig.gpu_options.per_process_gpu_memory_fraction = 0.975 # Don't take 100% of the memory
  gconfig.allow_soft_placement = True # Fall back to another device if an op has no GPU implementation
  gconfig.gpu_options.allow_growth = True # Allocate memory incrementally instead of grabbing it all upfront
  return gconfig
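
This config seems to be applied when the session is created; I have not traced the exact call site, but presumably it is something along these lines (a sketch, not the actual code):

# Presumed usage of the config above when building the TF1-style compat session
sess = tf.compat.v1.Session(config=get_gpu_config())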

My Python output tells me it found my graphics card:

Python Console

And it is also visible in my nvidia-smi output: nvidia-smi output

Am I missing a configuration? The run times are similar to what I got on a CPU-only system, which seems off to me.

Sidenote:

The code I am trying to run had to be migrated from tensorflow-gpu 1.12, but that migration went "relatively" smoothly.


Yes, this behaviour is normal for TensorFlow!

From the TensorFlow docs:

By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process. This is done to more efficiently use the relatively precious GPU memory resources on the devices by reducing memory fragmentation. To limit TensorFlow to a specific set of GPUs, use the tf.config.set_visible_devices method.
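
The tf.config.set_visible_devices method mentioned above looks like this in practice; a minimal sketch, assuming you have at least one GPU and only want the first one exposed to TensorFlow:

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    # Only expose the first physical GPU to TensorFlow
    tf.config.set_visible_devices(gpus[0], 'GPU')
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
  except RuntimeError as e:
    # Visible devices must be set before GPUs have been initialized
    print(e)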


If you don't want TensorFlow to allocate all of your VRAM, you can either set a hard limit on how much memory it may use or tell TensorFlow to only allocate as much memory as it actually needs.

To set a hard limit

Configure a virtual GPU device as follows:

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

Only use as much memory as needed

  • You can set the environment variable TF_FORCE_GPU_ALLOW_GROWTH=true (see the sketch after the second code block below)

OR

  • Use tf.config.experimental.set_memory_growth as follows:

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
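
For the environment-variable option, a minimal sketch: the variable has to be set before TensorFlow initializes the GPU, so either set it in the shell before starting Python or at the very top of your entry-point script:

import os

# Must be set before TensorFlow touches the GPU, so do it before the import
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

import tensorflow as tf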

All code and information here is taken from https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth