How can I flush GPU memory using CUDA (physical reset is unavailable)
Check what is using your GPU memory with
sudo fuser -v /dev/nvidia*
Your output will look something like this:
                     USER        PID ACCESS COMMAND
/dev/nvidia0:        root       1256 F...m  Xorg
                     username   2057 F...m  compiz
                     username   2759 F...m  chrome
                     username   2777 F...m  chrome
                     username  20450 F...m  python
                     username  20699 F...m  python
Then kill the processes you no longer need, either from within htop
or with
sudo kill -9 PID
In the example above, PyCharm was eating a lot of memory, so I killed PIDs 20450 and 20699.
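kill -9 sends SIGKILL, which gives the process no chance to clean up. As a gentler variant, here is a minimal Python sketch that sends SIGTERM first and only escalates to SIGKILL if the process is still around after a grace period. The function name stop_process and its parameters are made up for illustration, and it assumes a Unix-like host. Killing another user's process still needs root, just as with sudo kill.

```python
import os
import signal
import time


def stop_process(pid, grace_seconds=5.0, poll_interval=0.1):
    """Ask a process to exit with SIGTERM; send SIGKILL if it is still there
    after grace_seconds. Returns "not_running", "terminated", or "killed".

    Hypothetical helper, not part of any library. Note that a child of the
    calling process that has exited but not been reaped (a zombie) still
    counts as existing for the check below.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "not_running"

    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the PID exists
        except ProcessLookupError:
            return "terminated"
        time.sleep(poll_interval)

    try:
        os.kill(pid, signal.SIGKILL)  # same signal as `kill -9 PID`
    except ProcessLookupError:
        return "terminated"
    return "killed"
```

For the example above this would be called as stop_process(20450) and stop_process(20699), run as a user allowed to signal those processes.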
First run
nvidia-smi
then find the PID of the process you want to kill in its process list and run
sudo kill -9 PID
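If you would rather not read the PIDs off the nvidia-smi table by eye, its CSV query mode can be parsed from a script. The sketch below assumes nvidia-smi is on the PATH and supports --query-compute-apps; the helper name parse_compute_apps is made up for illustration. This query lists compute (CUDA) processes, so graphics clients such as Xorg may not show up in it.

```python
import csv
import io
import subprocess

# Ask nvidia-smi for one CSV row per compute process: PID, name, memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Turn nvidia-smi CSV rows into dicts sorted by memory use, largest first.

    Hypothetical helper for illustration only.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        pid, name, mem = (f.strip() for f in fields)
        rows.append({
            "pid": int(pid),
            "name": name,
            # the memory field may be non-numeric on some setups
            "used_mib": int(mem) if mem.isdigit() else None,
        })
    return sorted(rows, key=lambda r: r["used_mib"] or 0, reverse=True)


if __name__ == "__main__":
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    for row in parse_compute_apps(out):
        print(row["pid"], row["used_mib"], row["name"])
```

The PIDs it prints are the ones to pass to sudo kill -9 PID.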
Although it should be unnecessary to do this in anything other than exceptional circumstances, the recommended way to do this on Linux hosts is to unload the nvidia driver by running
$ rmmod nvidia
with suitable root privileges and then reloading it with
$ modprobe nvidia
If the machine is running X11, you will need to stop it manually beforehand and restart it afterwards. The driver initialisation process should eliminate any prior state on the device.
This answer has been assembled from comments and posted as a community wiki to get this question off the unanswered list for the CUDA tag.
I had the same problem and found a good solution on Quora, using
sudo kill -9 PID
See https://www.quora.com/How-do-I-kill-all-the-computer-processes-shown-in-nvidia-smi
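For those using Python with PyTorch (this only returns memory that PyTorch has cached but is no longer using; memory held by live tensors stays allocated):
import gc
import torch
gc.collect()  # drop unreachable Python objects that may still hold tensors
torch.cuda.empty_cache()  # hand PyTorch's unused cached blocks back to the driver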
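To see whether those two calls actually gave anything back, you can compare PyTorch's own counters before and after. This is a rough sketch that assumes a reasonably recent PyTorch (one that provides torch.cuda.memory_reserved); the function name release_cached_gpu_memory is made up for illustration, and it returns None when PyTorch or a CUDA device is not available.

```python
import gc

try:
    import torch
except ImportError:  # PyTorch not installed; nothing to report in that case
    torch = None


def release_cached_gpu_memory():
    """Collect garbage, empty PyTorch's CUDA cache, and report byte counts.

    Hypothetical helper. Returns None when PyTorch or CUDA is unavailable.
    """
    if torch is None or not torch.cuda.is_available():
        return None
    before = torch.cuda.memory_reserved()  # bytes held by PyTorch's allocator
    gc.collect()
    torch.cuda.empty_cache()
    return {
        "reserved_before": before,
        "reserved_after": torch.cuda.memory_reserved(),
        # bytes still occupied by live tensors; empty_cache() cannot free these
        "still_allocated": torch.cuda.memory_allocated(),
    }


if __name__ == "__main__":
    print(release_cached_gpu_memory())
```

If still_allocated stays high, some tensor is still referenced somewhere (del it first), and if nvidia-smi shows memory held by a different process, this has no effect on it and killing that process is the remaining option.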