I'm running Ubuntu and trying to configure QEMU with a GPU pass through. I was following those guides:

https://www.youtube.com/watch?v=w-hOr44oBAI

https://www.pugetsystems.com/labs/articles/Multiheaded-NVIDIA-Gaming-using-Ubuntu-14-04-KVM-585/#Step5Createascripttoruneachvirtualmachine

http://www.howtogeek.com/117635/how-to-install-kvm-and-create-virtual-machines-on-ubuntu/

My /etc/modules:

lp
rtc
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel 

My /etc/default/grub:

GRUB_DEFAULT=0
GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"
GRUB_CMDLINE_LINUX=""

My GPU:

$ lspci -nn | grep NVIDIA
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK106 [GeForce GTX 650 Ti] [10de:11c6] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GK106 HDMI Audio Controller [10de:0e0b] (rev a1)
$ lspci -nn | grep -i graphic
00:02.0 VGA compatible controller [0300]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller [8086:0152] (rev 09)

My /etc/initramfs-tools/modules:

pci_stub ids=10de:11c6,10de:0e0b

pci_stub seems to be working:

$ dmesg | grep pci-stub
[    0.541737] pci-stub: add 10DE:11C6 sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
[    0.541750] pci-stub 0000:01:00.0: claimed by stub
[    0.541755] pci-stub: add 10DE:0E0B sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
[    0.541760] pci-stub 0000:01:00.1: claimed by stub

My /etc/vfio-pci1.cfg:

0000:01:00.0
0000:01:00.1

My ~/windows_start.bash: http://pastebin.com/F7fq2Szt

After I run the bash script, vfio-pci is being used as a driver:

$ lspci -k | grep -C 3 -i nvidia
    Kernel driver in use: ahci
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
    Subsystem: ASRock Incorporation Motherboard
01:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 650 Ti] (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device 3557
    Kernel driver in use: vfio-pci
01:00.1 Audio device: NVIDIA Corporation GK106 HDMI Audio Controller (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device 3557
    Kernel driver in use: vfio-pci
03:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03)

Software versions:

$ kvm --version
QEMU emulator version 2.5.0, Copyright (c) 2003-2008 Fabrice Bellard

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.3 LTS
Release:    14.04
Codename:   trusty

The problem is that when I run windows_start.bash, QEMU terminal starts, but nothing is happening. The monitor that is connected to NVIDIA GPU is black, it's supposed to be turned on by QEMU, but it doesn't. What am I doing wrong? How can I debug it? What else can I try to achieve GPU pass through?

I checked using this guide and it seems that my GPU doesn't support UEFI, so maybe that's the reason it fails? It's still weird, a lot of people from had success using even older GPUs, so there must be a way.

EDIT: I've just tried to run the vm with libvirt using virt-manager, as suggested by @Deltik. Here's what my config looks like: http://pastebin.com/W46kNcrh

The result was pretty much the same as before - it started, showed a black screen in virt-manager's window and nothing else happened. There were no errors in the debug console (which I started by running virt-manager --debug). I've also tried the same approach on Arch Linux and on a newer version of Ubuntu, it made no difference whatsoever.

I've given the bounty to @Deltik, because he gave me a few good advices, but I still wasn't able to make it work. It seems that this task is impossible to complete, at least with my current hardware.


Solution 1:

You're close.

Using both pci-stub and vfio-pci

It's okay to use pci-stub to reserve a PCI device (like your GPU) to prevent the graphics driver from grabbing it, since the graphics driver (like nouveau or fglrx) will not let go of the device.

In fact, in my test, I needed to claim the PCI graphics card with pci-stub first because vfio-pci wouldn't do so on boot, which is one of the problems you experienced. While unloading one driver (pci-stub) and loading another (vfio-pci) in its place may seem ugly to some, pci-stub and vfio-pci are a reliable tag team that results in a successful virtual machine with GPU passthrough. My testing has not found success using just one or the other.

Now, you need to release the PCI device from pci-stub and hand it to vfio-pci. This part of your script should be doing that already:

configfile=/etc/vfio-pci1.cfg

vfiobind() {
    dev="$1"
        vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
        device=$(cat /sys/bus/pci/devices/$dev/device)
        if [ -e /sys/bus/pci/devices/$dev/driver ]; then
                echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
        fi
        echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id

}

modprobe vfio-pci

cat $configfile | while read line;do
    echo $line | grep ^# >/dev/null 2>&1 && continue
        vfiobind $line
done

Caveat: "vfiobind" is only needed once

As noted in this comment, it is true that the switchover from pci-stub to vfio-pci only needs to be done once after booting. This is true, but it is actually harmless to run the "vfiobind" function multiple times unless a virtual machine is currently using the affected PCI device.

If the virtual machine is using the device, the unbind operation will be blocked ("D state" process). This can be fixed by shutting down or killing the virtual machine, after which the unbind will probably succeed.

You can avoid this unnecessary extra unbinding and rebinding by changing your vfiobind() function to read as follows:

vfiobind() {
        dev="$1"
        vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
        device=$(cat /sys/bus/pci/devices/$dev/device)
        if [ -e /sys/bus/pci/devices/$dev/driver/module/drivers/pci\:vfio-pci ]; then
                echo "Skipping $dev because it is already using the vfio-pci driver"
                continue;
        fi
        if [ -e /sys/bus/pci/devices/$dev/driver ]; then
                echo "Unbinding $dev"
                echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
                echo "Unbound $dev"
        fi
        echo "Plugging $dev into vfio-pci"
        echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id
        echo "Plugged $dev into vfio-pci"
}

Check lspci -k for successful vfio-pci driver attachment

Verify that vfio-pci has taken over using lspci -k. This example is from my equivalent setup:

01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 760] (rev a1)
    Subsystem: eVga.com. Corp. Device 3768
    Kernel driver in use: vfio-pci
01:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
    Subsystem: eVga.com. Corp. Device 3768
    Kernel driver in use: vfio-pci

If you don't see Kernel driver in use: vfio-pci, something went wrong with the part of your script that I pasted above.

QEMU passthrough configuration

I struggled a bit with the black display.

In an earlier revision of your script, you specified:

-device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1,bus=root.1,addr=00.1 \

Try letting QEMU decide what virtual bus and address to use:

-device vfio-pci,host=01:00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1 \

You also should pass the -nographic and -vga none flags to qemu-system-x86_64. By default, QEMU reveals an emulated graphics card to the virtual machine, and the virtual machine may use this other video device to display instead of your intended physical NVIDIA card.

If you're still getting a blank display, try adding the -nodefaults flag as well, which excludes the default serial port, parallel port, virtual console, monitor device, VGA adapter, floppy device, and CD-ROM device.

Now, your qemu-system-x86_64 command should be able to start your virtual machine with PCI devices 01:00.0 and 01:00.1 passed through and the display connected to 01:00.0 should be showing something.

Reference / Sample Configuration

My test isn't identical to yours, but I was able to get working graphics passthrough and USB passthrough with this qemu-system-x86_64 command, after claiming all the relevant PCI devices from pci-stub with vfio-pci:

qemu-system-x86_64 \
-enable-kvm \
-name node51-Win10 \
-S \
-machine pc-i440fx-2.1,accel=kvm,usb=off \
-cpu host,kvm=off \
-m 16384 \
-realtime mlock=off \
-smp 8,sockets=8,cores=1,threads=1 \
-uuid 5c4a3e8a-6e8e-449f-9361-29fcdc35358d \
-nographic \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/node51-Win10.monitor,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=localtime,driftfix=slew \
-global kvm-pit.lost_tick_policy=discard \
-no-hpet \
-no-shutdown \
-global PIIX4_PM.disable_s3=0 \
-global PIIX4_PM.disable_s4=0 \
-boot strict=on \
-device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 \
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 \
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 \
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
-drive file=/dev/zd16,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-drive file=/media/isos/Win10_English_x64.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw \
-device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
-device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 \
-netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:11:bf:dd,bus=pci.0,addr=0x3 \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=serial0 \
-device usb-tablet,id=input0 \
-device intel-hda,id=sound0,bus=pci.0,addr=0x4 \
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
-device vfio-pci,host=01:00.1,id=hostdev0,bus=pci.0,addr=0x9 \
-device vfio-pci,host=00:12.0,id=hostdev1,bus=pci.0,addr=0x8 \
-device vfio-pci,host=00:12.2,id=hostdev2,bus=pci.0,addr=0xa \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 \
-device vfio-pci,host=01:00.0,x-vga=on \
-vga none \
-msg timestamp=on

Relevant items from lspci -k:

00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
    Subsystem: Gigabyte Technology Co., Ltd Device 5004
    Kernel driver in use: vfio-pci
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
    Subsystem: Gigabyte Technology Co., Ltd Device 5004
    Kernel driver in use: vfio-pci
01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 760] (rev a1)
    Subsystem: eVga.com. Corp. Device 3768
    Kernel driver in use: vfio-pci
01:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
    Subsystem: eVga.com. Corp. Device 3768
    Kernel driver in use: vfio-pci

Observed result:

Photograph of the observed result