
Single GPU Passthrough on Linux

Tested on Fedora 28 and Arch Linux

Currently working on 10 Series (Pascal) Nvidia GPUs

Now working with 5 through 10 Series cards with manual hex editing

Pull Requests and Issues are welcome

Special Thanks to:

The Passthrough Post (https://passthroughpo.st)

For hosting news and information about VFIO passthrough, and for the libvirt/qemu hook helper in this guide.

andre-richter

For providing the vfio-pci-bind tool. It is no longer used in this guide, but it was previously, and he still deserves thanks.

Matoking

For the Nvidia ROM patcher, which makes it possible to pass the boot GPU to the VM without GPU BIOS problems.

Sporif

For diagnosing, developing, and testing methods to successfully rebind the EFI-Framebuffer when passing the video card back to the host OS.

droidman

For instructions on manually editing the vBIOS hex for use with VFIO passthrough

So many other people and organizations I need to thank. If you feel your name should be here, please contact me. Credit where credit is due is very important to me, and to making the Linux community a better place.

Contents

  1. Disclaimer
  2. Background
  3. Prerequisites and Assumptions
  4. Procedure

Disclaimer

I have no qualifications as a Linux admin/user/professional/intelligent person. I am a Windows systems administrator. GNU/Linux is a hobby and a passion. I have very little programming experience and I am very bad at all types of scripting. I have hacked together a lot of work that other people have done and put it in one place.

Background

Historically, VFIO passthrough has been built around a very specific hardware model, i.e.:

  • 2 GPUs, one for the host and one for the VM
  • 2 monitors OR a monitor with 2 inputs
    • or a KVM switch

I personally, as well as some of you out there, might not have those things available. Maybe you've got a Mini-ITX build with no iGPU. Or maybe you're poor like me, and can't shell out for new computer components without some financial planning beforehand.

Whatever your reason, VFIO is still possible, but with caveats. Here are some advantages and disadvantages of this model.

This setup model is a lot like dual booting, without actually rebooting.

Advantages

  • As already stated, this model only requires one GPU
  • The ability to switch back and forth between different OSes with FULL use of a discrete graphics processor (Linux on Host with full GPU, Windows 10 Guest with Full GPU, MacOS guest with full GPU)
  • Bragging rights
  • Could be faster than dual booting (this depends on your system)
  • Using virtual disk images (like qcow) gives you management of snapshots, making breaking your guest OS easy to recover from.

Disadvantages

  • Can only use one OS at a time.
    • Once the VM is running, it's basically like running that as your main OS. You will be logged out of your user on the host and will be unable to manage the host locally at all. You can still use ssh/vnc/xrdp to manage the host.
  • There are still some quirks (I need your help to iron these out!)
  • Using virtual disk images could be a performance hit
    • You can still use raw partitions/LVM/pass through raw disks, but you lose the more robust snapshot and management features
  • If you DO have a second video card, solutions like looking-glass are WAYYY more convenient and need active testing and development.
  • All VMs must be run as root. There are security considerations to be made there. This model requires a level of risk acceptance.

For my personal use case, this model is worth it, and it might be for you too!

Prerequisites and Assumptions

Assumptions

This guide is going to assume a few things

  1. You have a system capable of VFIO passthrough, i.e. a processor that supports IOMMU, sane IOMMU groups, etc.
  2. Unfortunately, for the time being, a 10 Series Nvidia GPU. The VFIO ROM patcher we will be using only works with these specifically.
  3. I am going to start from a place where you have a working libvirt config, or qemu script, that boots a guest OS without PCI devices passed through.

I am not going to cover the basic setup of VFIO passthrough here. There are a lot of guides out there that cover the process from beginning to end.

What I will say is that using the Arch Wiki is your best bet.

Follow the instructions found here: https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF

Skip the "Isolating the GPU" section. We are not going to do that in this method, as we still want the host to have access to the card. I will cover this again in the procedure section.
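
For orientation, the wiki's setup boils down to enabling IOMMU on the kernel command line. A minimal sketch for an Intel system using GRUB (AMD systems generally have IOMMU enabled by default, and your bootloader may differ; the wiki has the details):

# In /etc/default/grub, add the IOMMU parameters to the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# Regenerate the GRUB config, then reboot
sudo grub-mkconfig -o /boot/grub/grub.cfg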

Prerequisites

  1. A working Libvirt VM or Qemu script for your guest OS.
  2. IOMMU enabled and sane IOMMU groups (see the check script after this list)
  3. The tools used below: the Nvidia vBIOS ROM patcher and the libvirt/qemu hook helper
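
To confirm your IOMMU groups are sane, the commonly used script from the Arch wiki prints every group along with the devices in it. Your GPU and its audio function should ideally sit in their own group:

#!/bin/bash
shopt -s nullglob
# Walk every IOMMU group and list the PCI devices it contains
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU Group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo -e "\t$(lspci -nns "${d##*/}")"
    done
done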

With all this ready, let's move on to how to actually do this.

Procedure

Patching the GPU ROM for the VM

First of all, we need a usable ROM for the VM. When the boot GPU is already initialized, you're going to get an error from QEMU about the usage count. This will fix that problem.

  1. Get a ROM for your GPU
    • You can either download one from here https://www.techpowerup.com/vgabios/ or
    • Use nvflash to dump the BIOS currently on your GPU. nvflash is pretty straightforward, but I won't cover it here.
  2. Patch the BIOS file:

With the Nvidia vBIOS Patcher

In the directory where you saved the original vBIOS, use the patcher tool.

python nvidia_vbios_vfio_patcher.py -i <ORIGINAL_ROM> -o <PATCHED_ROM>

Now you should have a patched vBIOS file, which you should place somewhere you can remember later. I store mine with other libvirt files in /var/lib/libvirt/vbios/
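
For example (patched.rom here is just whatever filename the patcher produced for you; patched-bios.bin matches the XML below):

# Create the storage directory and stash the patched ROM there
sudo mkdir -p /var/lib/libvirt/vbios/
sudo cp patched.rom /var/lib/libvirt/vbios/patched-bios.bin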

Manually

Use the dumped/downloaded BIOS and open it in a hex editor.

Search the strings for the line containing "VIDEO" that starts with a "U".

Delete everything above that line.

Save!

  3. Attach the PCI device to your VM
    • In libvirt, use "+ Add Hardware" -> "PCI Host Device" to add the video card and its audio device
  4. Edit the libvirt XML file for the VM and add the patched vBIOS file that we've generated
sudo virsh edit {VM Name}
<hostdev>
	...
	<rom file='/var/lib/libvirt/vbios/patched-bios.bin'/>
	...
</hostdev>
  5. Save and close the XML file
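
For reference, a fuller sketch of what that <hostdev> entry might end up looking like, assuming the GPU sits at 0000:01:00.0 as in the hook scripts below (your bus/slot values may differ):

<hostdev mode='subsystem' type='pci' managed='yes'>
	<source>
		<address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
	</source>
	<rom file='/var/lib/libvirt/vbios/patched-bios.bin'/>
</hostdev>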

Setting up Libvirt hooks

Using libvirt hooks will allow us to automagically run scripts before the VM is started and after the VM has stopped.

Use the instructions here https://passthroughpo.st/simple-per-vm-libvirt-hooks-with-the-vfio-tools-hook-helper/ to install the base scripts; you'll end up with a directory structure that looks like this:

/etc/libvirt/hooks
├── qemu <- The script that does the magic
└── qemu.d
    └── {VM Name}
        ├── prepare
        │   └── begin
        │       └── start.sh
        └── release
            └── end
                └── revert.sh

Anything in the directory /etc/libvirt/hooks/qemu.d/{VM Name}/prepare/begin will run when starting your VM

Anything in the directory /etc/libvirt/hooks/qemu.d/{VM Name}/release/end will run when your VM is stopped
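
A minimal sketch of setting that structure up by hand, assuming a VM named win10 (replace with your VM's name as shown by virsh list --all). Note that the hook helper and the scripts must be executable, or libvirt won't run them:

# Create the per-VM hook directories
sudo mkdir -p /etc/libvirt/hooks/qemu.d/win10/prepare/begin
sudo mkdir -p /etc/libvirt/hooks/qemu.d/win10/release/end

# Make the hook helper and both scripts executable
sudo chmod +x /etc/libvirt/hooks/qemu
sudo chmod +x /etc/libvirt/hooks/qemu.d/win10/prepare/begin/start.sh
sudo chmod +x /etc/libvirt/hooks/qemu.d/win10/release/end/revert.sh

# Restart libvirtd so the hook helper is picked up
sudo systemctl restart libvirtd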

Libvirt Hook Scripts

I've made my start script /etc/libvirt/hooks/qemu.d/{VM Name}/prepare/begin/start.sh

Start Script

#!/bin/bash
  
# Stop display manager
systemctl stop display-manager.service
  
# Unbind VTconsoles
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
  
# Unbind EFI-Framebuffer
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
  
# Unbind the GPU from display driver
echo -n "0000:01:00.0" > /sys/bus/pci/drivers/nvidia/unbind
echo -n "0000:01:00.1" > /sys/bus/pci/drivers/snd_hda_intel/unbind
  
# Load VFIO Kernel Module  
modprobe vfio-pci  
  
sleep 1
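
Note that the PCI addresses above (0000:01:00.0 for the GPU, 0000:01:00.1 for its audio function) are from my system. Find yours before copying these scripts:

# List Nvidia functions with their PCI addresses and bound drivers
lspci -nnk | grep -A 3 -i nvidia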

VM Stop script

My stop script is /etc/libvirt/hooks/qemu.d/{VM Name}/release/end/revert.sh

#!/bin/bash
  
# Unload VFIO-PCI Kernel Driver
modprobe -r vfio-pci
  
# Re-Bind GPU to Nvidia Driver
echo -n "0000:01:00.0" > /sys/bus/pci/drivers/nvidia/bind
echo -n "0000:01:00.1" > /sys/bus/pci/drivers/snd_hda_intel/bind
  
# Wait 1 second to avoid possible race condition
sleep 1
  
# Re-Bind EFI-Framebuffer
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/bind
  
# Re-bind to virtual consoles
echo 1 > /sys/class/vtconsole/vtcon0/bind
echo 1 > /sys/class/vtconsole/vtcon1/bind
  
# Restart Display Manager
systemctl start display-manager.service

Running the VM

When running the VM, the scripts should now automatically stop your display manager, unbind your GPU from all drivers currently using it, and pass control over to libvirt. Libvirt handles binding the card to VFIO-PCI automatically.

When the VM is stopped, libvirt will also handle removing the card from VFIO-PCI. The stop script will then rebind the card to the Nvidia driver and SHOULD rebind your vtconsoles and EFI-Framebuffer.
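
Since the local session goes away while the guest runs, it can be handy to drive this over ssh. A quick sketch, again assuming a VM named win10:

# Start the VM; the prepare/begin hook runs first
sudo virsh start win10

# After shutting the guest down, confirm the GPU is back on the nvidia driver
lspci -nnk -s 01:00.0 | grep 'Kernel driver in use'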

TODO: QEMU Scripts without Libvirt

This is also possible, but will require a significantly different process. I might write up another process altogether and separate the two entirely.

Want to test on other GPUs/Distributions/Other mad scientist stuff?

Please let me know what you find!

As always, make a pull request or open an issue. The issue tracker has already solved one problem for me.

Fuel my coffee addiction

Always appreciated, never required.

ETH: 0xE4Bf3fc0562f7F63d0F9dF94E87e01C217D30918
