Faster Virtual Machines on Linux Hosts with GPU Acceleration

Overview

Open source virtualization technologies widely available in the Linux software ecosystem lack the easy-to-use graphical performance enhancements found in commercial virtualization products such as VMWare Workstation or VMWare vSphere/ESXi. Intel GVT-g is a virtual graphics acceleration technology that can be used through the QEMU virtualization system, an open-source alternative to VMWare Workstation and VMWare vSphere/ESXi. Intel GVT-g was configured on a ThinkPad X1 Generation 6 laptop with Intel integrated graphics, resulting in working GPU acceleration in a UEFI Windows 10 64-bit guest without relying on proprietary software aside from the guest operating system itself. Working Intel GVT-g GPU acceleration makes substantially improved virtualization performance possible on Linux hosts.

Introduction

Computer users rely on software written for many different operating systems. Virtual machines allow computer users to run different operating systems simultaneously and switch between them easily. Virtualization offers benefits such as migrating installed systems to other physical machines with lower downtime, containing untrusted code in a sandbox that is difficult to escape from, keeping legacy systems running when their original hardware is obsolete, or simply running a Windows-only program on a Linux host.

Virtual machines with graphical user interfaces typically suffer from input lag and stuttering, both of which degrade the user experience. Additionally, computation-heavy software such as photo editing or engineering tools depends on efficient GPU access, which can speed up calculations by an order of magnitude or more over the host machine CPU, to finish work in a reasonable time period. Unfortunately, not all virtualization solutions are able to leverage the physical chips on the host machine efficiently, regardless of cost.

GPU command architectures

  1. VGA Emulation (VE)
    • Universally available on all virtualization platforms
  2. API forwarding (AF)
    • Intel GVT-s
    • VMWare Virtual Shared Graphics Acceleration (vSGA)
    • Oracle VirtualBox 3D Acceleration
  3. Direct Pass-Through (DPT)
    • Intel GVT-d
    • VMWare Virtual Dedicated Graphics Acceleration (vDGA)
    • Not available in VMWare Workstation
  4. Full GPU Virtualization (FGV)
    • Intel GVT-g
    • VMWare Virtual Shared Pass-Through Graphics Acceleration (vGPU or MxGPU)
    • Not available in VMWare Workstation

VGA Emulation (VE)

The most primitive graphics display for any virtual machine is VGA Emulation (VE). This mode is also the most inefficient. QEMU emulates a Cirrus Logic GD5446 video card. All Windows versions starting from Windows 95 should recognize and use this graphics card.
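As a baseline, the sketch below launches a QEMU guest with plain emulated VGA and no acceleration; the disk image name and memory size are illustrative placeholders, not values from a real setup:

```shell
# Boot a guest using the emulated Cirrus Logic GD5446 video card.
# "win10.qcow2" is a placeholder disk image; adjust paths and sizes.
qemu-system-x86_64 \
  -enable-kvm \
  -m 4G \
  -drive file=win10.qcow2,format=qcow2 \
  -vga cirrus
```

Passing `-vga std` instead selects QEMU's generic standard VGA adapter, which modern guests often handle better than the Cirrus model.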

Most hypervisors which advertise some form of "hardware acceleration" use API Forwarding (AF), a proxy service that requires specialized drivers on both the host and guest to create a high performance instruction pipeline.

API forwarding (AF)

API Forwarding (AF) works by:

  1. intercepting the GPU command requested by a piece of software
  2. proxying the GPU command to the host hypervisor
  3. executing the captured GPU command on the host from the hypervisor
  4. bubbling the response back up to the virtual machine

This mode is very useful when many virtual machines are competing for the resources of a single GPU and Full GPU Virtualization (FGV) is not possible. The hypervisor queues graphics card operations from one or more virtual machines and schedules virtual execution and memory slots for each virtual machine on a single physical GPU resource. Each virtual machine sees its own graphics card while the hypervisor splits the single physical resource up. A key drawback of AF is that usually only the OpenGL and DirectX interfaces are supported by the GPU instruction proxy.

The process by which API Forwarding (AF) works is known as paravirtualization.

Direct Pass-Through (DPT)

Direct Pass-Through (DPT) is a system which exposes the GPU as a PCI device which is directly addressable by the virtual machine. Nothing besides the virtual machine can reference any resources on the GPU and it cannot be shared with the physical machine or any other virtual machines. Many devices have only one graphics card installed and using this system would mean making the graphical user interface inoperable. This method is most useful when the host has more than one GPU, so that one card can be dedicated entirely to a virtual machine while another drives the host's own display.
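A minimal sketch of DPT on a Linux host using the kernel's vfio-pci driver follows; the PCI address and vendor:device IDs are examples only, so substitute the values reported by `lspci -nn` on your machine:

```shell
# Unbind the GPU from its host driver and hand it to vfio-pci.
# 0000:01:00.0 and "10de 1b80" are example values, not real hardware.
modprobe vfio-pci
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
echo 10de 1b80 > /sys/bus/pci/drivers/vfio-pci/new_id

# The guest now owns the device exclusively:
qemu-system-x86_64 -enable-kvm -m 8G \
  -device vfio-pci,host=01:00.0 \
  -drive file=win10.qcow2,format=qcow2
```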

Full GPU Virtualization (FGV)

Sharing a GPU natively among multiple virtual machines is possible with Full GPU Virtualization (FGV) solutions such as Intel GVT-g. This process is also known as Hardware Assisted Virtualization (HVM), not to be confused with Paravirtualization (PV). In this mode the IOMMU hardware exposes a GPU memory interface to each virtual machine while it internally handles the memory address mappings between what it exposes to virtual machines and the actual physical memory on the GPU.

In "IOMMU and Virtualization," Susanta Nanda writes:

IOMMU provides two main functionalities: virtual-to-physical address translation and access protection on the memory ranges that an I/O device is trying to operate on.
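Before attempting DPT or FGV it is worth confirming that the IOMMU is actually active; these diagnostic commands assume an Intel host booted with the `intel_iommu=on` kernel parameter:

```shell
# Verify the IOMMU kernel parameter and look for DMAR initialization
# messages ("DMAR: IOMMU enabled") in the kernel log.
grep -o 'intel_iommu=[^ ]*' /proc/cmdline
dmesg | grep -i -e DMAR -e IOMMU | head
```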

According to Intel engineer Zhenyu Wang's XDC2017 presentation "Full GPU virtualization in mediated pass-through way," peak media workload performance is 95% of the native host alone when running one virtual machine, and average media workload performance is 85% of the native host alone.

Hypervisors

Hypervisors are software systems that enable multiple virtual machines to run simultaneously on a single physical machine. Linux users have many hypervisor options. These are a sample of some Linux hypervisors:

  1. Oracle VirtualBox
  2. VMWare vSphere/ESXi (technically runs beneath Linux)
  3. VMWare Workstation
  4. QEMU

Oracle VirtualBox

Possibly the most widely-used hypervisor is VirtualBox by Oracle. VirtualBox is open source software with the exception of the optional extension pack.

The Oracle VirtualBox extension pack provides many features which are not available in the free version. The PCI passthrough module was shipped as an Oracle VM VirtualBox extension package until the feature was scrapped.

These features are available gratis for personal and non-commercial use only.

Possession of the VirtualBox Extension Pack without a license can be problematic:

Got an email today informing me (Urgent Virtual Box Licensing Information for Company X) that there have been TWELVE (12!) downloads of the VirtualBox Extension Pack at my employer in the past year. And since the extensions are licensed differently than the base product, they'd love for us to call them and talk about how much money we owe them. Their report attached to email listed source IPs and AS number, as well as date/product/version. Out of the twelve (12!), there were always two on the same day of the same version, so really six (6!) downloads. We'll probably end up giving them $150, and I'll make sure they never get any business from places I work, because fuck Oracle. I wouldn't piss on Larry Ellison if he was on fire.

VirtualBox Linux hosts do not support GPU DPT (Direct Pass-Through) at all. All of the preliminary PCI pass-through work for Linux hosts which is needed for GPU DPT was completely stripped out on December 5th, 2019 with this message:

Linux host: Drop PCI passthrough, the current code is too incomplete (cannot handle PCIe devices at all), i.e. not useful enough

VirtualBox 2D and 3D acceleration both work according to the same principle, API forwarding (AF):

Oracle VM VirtualBox implements 3D acceleration by installing an additional hardware 3D driver inside the guest when the Guest Additions are installed. This driver acts as a hardware 3D driver and reports to the guest operating system that the virtual hardware is capable of 3D hardware acceleration. When an application in the guest then requests hardware acceleration through the OpenGL or Direct3D programming interfaces, these are sent to the host through a special communication tunnel implemented by Oracle VM VirtualBox. The host then performs the requested 3D operation using the host's programming interfaces.

VMWare vSphere/ESXi

VMWare vSphere/ESXi is a bare metal hypervisor which runs beneath any end-user operating systems. This property makes it a Type 1 Hypervisor. It supports all GPU acceleration technologies.

The main limitation of VMWare vSphere/ESXi GPU acceleration is graphics card selection. GPU passthrough is possible only with a small set of GPUs because NVIDIA drivers disable consumer-market GPUs such as the GeForce series when the drivers detect that they are running in a virtual environment. VMWare maintains a comprehensive list of all graphics cards supported for hardware acceleration purposes.

VMWare Workstation

VMWare Workstation only provides Virtual Shared Graphics Acceleration (vSGA), a form of API forwarding (AF). In this regard, the GPU acceleration story is identical to Oracle VirtualBox.

QEMU

QEMU is an open source virtual machine platform that is also capable of translating instructions between wholly unrelated computer architectures. It is widely available in most Linux distributions and is used extensively in industry.

As of the publication date of this article, after enabling GVT-g in QEMU you must also recompile QEMU with the 60 fps fix to get smooth video.
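Putting the pieces together, the following sketch creates a GVT-g vGPU on the host's integrated GPU and attaches it to a UEFI Windows 10 guest. The mdev type name, OVMF firmware path, and disk image are assumptions that vary by hardware and distribution, and the kernel is assumed to have been booted with `i915.enable_gvt=1` and `intel_iommu=on`:

```shell
# List the vGPU types this integrated GPU supports, then create one.
# 0000:00:02.0 is the usual Intel iGPU address; the type name
# (i915-GVTg_V5_4 here) differs between hardware generations.
ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/
GVT_UUID=$(uuidgen)
echo "$GVT_UUID" > \
  /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/create

# Boot a UEFI Windows 10 guest with the vGPU attached; OVMF path and
# disk image are placeholders for your own firmware and image.
qemu-system-x86_64 -enable-kvm -m 8G -smp 4 -machine q35 \
  -bios /usr/share/ovmf/OVMF.fd \
  -display gtk,gl=on \
  -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:00:02.0/$GVT_UUID,display=on,x-igd-opregion=on \
  -drive file=win10.qcow2,format=qcow2
```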

Conclusion

QEMU eclipses VirtualBox in features and exceeds the capabilities of VMWare. VirtualBox is limited to API forwarding (AF) because it cannot allow virtual machines to address graphics hardware directly in any way. VMWare solutions support all types of GPU addressing, but most graphics cards made by NVIDIA disable themselves when they detect being used in Direct Pass-Through (DPT) or Full GPU Virtualization (FGV) modes. QEMU offers hardware flexibility beyond that of VMWare and brings near-native graphics performance to guest operating systems such as Windows 10 with truly minimal driver support required in the guest operating system. I recommend using QEMU on Linux when high graphics performance and low operational costs are priorities for deploying a virtual machine environment.