Blacklist an NVIDIA GPU for qemu/kvm passthrough

I have run into similar problems to yours (Lubuntu 16.04). This happens because other drivers/modules bind the devices before pci-stub gets a chance to claim them. You have at least two options here:

The first and easiest one is to blacklist the modules that claim the device. Run lspci -knn | grep VGA -A 5 to see all your VGA PCI devices, their device IDs and the kernel modules that claim them.

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:128b] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:8c93]
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] GK208 HDMI/DP Audio Controller [1462:8c93]
--
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 970] [10de:13c2] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 [GeForce GTX 970] [19da:1366]
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau
02:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 High Definition Audio Controller [19da:1366]

Now you need to check which driver is in use. For example, nouveau grabbed my VGA device 02:00.0, which I want to use for my VM, so I blacklist it:

sudo nano /etc/modprobe.d/blacklist.conf

and add the line

blacklist nouveau

and you are done.
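Equivalently, without opening an editor, and with an optional initramfs rebuild so the blacklist also takes effect during early boot (a sketch for Ubuntu-based systems):

echo "blacklist nouveau" | sudo tee -a /etc/modprobe.d/blacklist.conf
sudo update-initramfs -u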

In my case this would cause a problem, since I have two NVIDIA cards installed (01:00.0 and 02:00.0) that both run on the same driver, so I do not blacklist the driver.

Instead I manually unbind nouveau from my 02:00.0 card, since I want to use that card for my VM guest and the 01:00.0 card for my Linux host. Thanks to this guide I found out how to do so: https://lwn.net/Articles/143397/

Run sudo tree /sys/bus/pci/drivers/nouveau. Replace nouveau with whatever module grabbed your device.

You should get a listing like this:

/sys/bus/pci/drivers/nouveau
├── 0000:01:00.0 -> ../../../../devices/pci0000:00/0000:00:03.0/0000:01:00.0
├── 0000:02:00.0 -> ../../../../devices/pci0000:00/0000:00:05.0/0000:02:00.0
├── bind
├── module -> ../../../../module/drm
├── new_id
├── remove_id
├── uevent
└── unbind

We see that the driver nouveau has two devices bound to it: 0000:01:00.0 and 0000:02:00.0.
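If tree is not installed, a plain directory listing shows the same symlinks:

ls -l /sys/bus/pci/drivers/nouveau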

To unbind and rebind my graphics card I first need to stop lightdm.service. For that I switch to a console outside the desktop environment, for example with CTRL+ALT+F2, log in as root and type systemctl stop lightdm.service

Now I can unbind the module from the graphics card:

echo -n "0000:02:00.0" > /sys/bus/pci/drivers/nouveau/unbind

and bind it to whatever module I want (pci-stub or vfio-pci). I used vfio-pci.

echo -n "0000:02:00.0" > /sys/bus/pci/drivers/vfio-pci/bind

After that you can start your display manager again: systemctl start lightdm.service

If everything worked you should find your device bound to the module you specified when you run lspci -knn | grep VGA -A 5 again.

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:128b] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:8c93]
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] GK208 HDMI/DP Audio Controller [1462:8c93]
--
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 970] [10de:13c2] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 [GeForce GTX 970] [19da:1366]
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau
02:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 High Definition Audio Controller [19da:1366]

Unfortunately this workaround loses its effect after a reboot, and I have not yet found out how to make it persistent; maybe somebody else can give me a hint. Something like a start script would be possible, I guess (see the sketch below). But it would be better to be able to bind the device to a specific module without having to unbind it first. Imagine I wanted to use the nvidia driver one day: in that case unbinding from nouveau would be useless, since the graphics card would be bound to the nvidia module.
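As an illustration of that idea, a minimal sketch of such a start script, simply collecting the manual steps from above (my device address and driver; it has to run as root):

#!/bin/bash
# Sketch: repeat the manual unbind/bind steps for the 02:00.0 card.
DEVICE="0000:02:00.0"
# Release the card from whatever driver currently holds it (nouveau in my case)
if [ -e "/sys/bus/pci/devices/$DEVICE/driver" ]; then
    echo -n "$DEVICE" > "/sys/bus/pci/devices/$DEVICE/driver/unbind"
fi
# Hand the card over to vfio-pci
echo -n "$DEVICE" > /sys/bus/pci/drivers/vfio-pci/bind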


I'm setting up qemu-kvm passthrough as well, and I had the same problem as you. I'm using my integrated Intel graphics as my primary GPU, so I opened the NVIDIA settings and disabled hybrid graphics so that the NVIDIA card won't be used (see the attached screenshot).

After that I had no problem binding the card to vfio-pci.
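On Ubuntu-based systems the same switch can usually also be made from the command line with prime-select from the nvidia-prime package (assuming it is installed; log out or restart the display manager afterwards):

sudo prime-select intel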

It is possible that the nvidia modules will somehow cause you trouble when starting qemu, or that you don't have the option to turn off hybrid graphics. In that case you can also try what I did and manually unload the nvidia modules with a script like this one, run from console mode (CTRL+ALT+F1):

#!/bin/bash
# Stop the display manager so the nvidia modules are no longer in use
sudo service lightdm stop
# Unload the nvidia modules in dependency order
sudo rmmod nvidia_uvm
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia
# Bring the desktop back up
sudo service lightdm start

This stops the display manager (in my case lightdm), unloads the nvidia modules in order, and restarts the display manager afterwards. Make sure to launch this from console mode, as running it from the desktop will most likely interrupt the script after the first line.
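A quick usage sketch (the file name is just an example): save the script, make it executable, switch to the console and run it:

chmod +x unload-nvidia.sh
./unload-nvidia.sh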

The nvidia modules will automatically load again when you reboot, but you can also load them again manually with:

sudo modprobe -a nvidia nvidia_modeset nvidia_drm nvidia_uvm

Hope this helps.


Deactivate nvidia/nouveau using the GRUB config.

There is the possibility to pass the module_blacklist=<module1>[,<module2>] directive (documentation) to the kernel via the GRUB 2 command line. I was able to deactivate the nouveau and nvidia drivers with the following addition to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub (don't forget to run sudo update-grub afterwards):

module_blacklist=nvidia,nvidia_uvm,nvidia_drm,nvidia_modeset,nouveau
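For reference, the resulting line in /etc/default/grub might look like this (a sketch; quiet splash stands for whatever options you already have), followed by regenerating the GRUB config:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash module_blacklist=nvidia,nvidia_uvm,nvidia_drm,nvidia_modeset,nouveau"
sudo update-grub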

There is also the possibility to automatically generate GRUB entries with and without this option for each kernel: https://unix.stackexchange.com/questions/24670/choose-at-grub-menu-whether-nvidia-driver-should-be-used (first answer). But it turned out to be more cumbersome than expected; the Ubuntu GRUB configuration is quite involved. Make sure to make a backup before tinkering with it.

This is especially helpful if you want to use a powerful NVIDIA card for gaming in a virtual machine via VGA passthrough, yet keep the option of using it on the host for deep learning, for example with TensorFlow. Only a reboot is required to switch between the two.