Operating system virtualization is a convenient way to run multiple operating systems at the same time, on the same hardware, without rebooting. Several technologies allow the GPU to be shared by both the host (native) and guest (virtualized) operating systems. Virtualization also makes it easy, and relatively fear-free, to use or hack guest operating system instances, since a simple file copy or restore operation will bring a guest machine back to life. One ideal setup is to purchase a twin-GPU SLI card and dedicate one GPU to the host operating system and one to the guest. NVIDIA calls this Multi-OS, and it is supported by some Quadro GPUs. Not for the faint of heart (but for those on a budget), one can modify a GeForce GPU to act like a Quadro: Part 1, Part 2, Part 3. Alternative approaches provide a network-based interface that virtualizes access to the GPUs (e.g. rCUDA and Virtual OpenCL).
There are two core ways that a virtual machine can give a guest operating system access to the GPU device(s):
- GPU pass-through – All PCIe commands are passed directly between the GPU and the guest VM (Virtual Machine). The native graphics drivers for the GPU are installed in the guest, which means the guest operating system should achieve native graphics performance with that card.
- Shared virtualized GPU – A hypervisor sits between the VM and the GPU, intercepting the PCIe commands and routing them between the device(s) and the guest operating systems.
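As a concrete illustration of the pass-through approach, the sketch below shows how a GPU might be detached from its host driver and handed to the vfio-pci stub driver on a Linux/KVM host, so a VM can claim it. The PCI address (01:00.0) and vendor:device IDs (10de 1380) are hypothetical placeholders for your own card, and the steps assume an IOMMU-capable system:

```shell
# Identify the GPU's PCI address and vendor:device IDs
# (the address and IDs used below are hypothetical placeholders)
lspci -nn | grep -i vga

# Load the vfio-pci stub driver on the host
sudo modprobe vfio-pci

# Unbind the GPU from its current host driver
echo 0000:01:00.0 | sudo tee /sys/bus/pci/devices/0000:01:00.0/driver/unbind

# Bind it to vfio-pci by vendor:device ID
echo 10de 1380 | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
```

The VM can then be started with the device assigned (e.g. QEMU's `-device vfio-pci,host=01:00.0`), at which point the guest's native driver takes over.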
The master's thesis by Kristoffer Robin Stokke at the University of Oslo provides a nice introduction, as well as introducing his Vocale virtualization software.
Several solutions exist to provide virtual access to a GPU:
- Nvidia Grid or Virtual Desktop Infrastructure (VDI)
Nvidia Grid is a commercial implementation described by NVIDIA as:
… the practice of hosting a desktop OS within a virtual machine (VM) running on a hosted, centralized or remote server. It allows IT departments to offer a secure “access anywhere” desktop experience to end users.
GameStream is NVIDIA's recently introduced personal game-streaming service; it appears to leverage VDI to let NVIDIA Shield devices exploit the power of a desktop NVIDIA GPU and the low network latency of the user's home network. A beta feature is also available for those who wish to try GameStream remotely. The beta feature isn't available for notebooks just yet, and you'll need a recommended bandwidth of 5 Mbps upload and download at the remote Wi-Fi location. (Users can also broadcast their finer – or not so fine – gameplay moments with NVIDIA ShadowPlay. Google just bought Twitch for around $1B, which makes streaming your gameplay a convenient way to show off your gaming prowess.)
ShadowPlay will also auto-name your Twitch stream with the title of the game you're currently playing, and wider broadcasting options – custom output resolution, custom framerate – should speed up the experience.
- VGA passthrough
VGA passthrough is an older technology that essentially performs a video overlay, somewhat similar to a television's picture-in-picture capability. The Xen hypervisor provides this capability for free as of Xen 4.0. The good news is that VGA passthrough is known to work with Intel Integrated Graphics Devices (IGDs) as of Xen 4.0, which can enable devices with 800 GF/s of performance. The bad news is that Xen VGA passthrough is a special form of PCI passthrough that dedicates the PCI device (graphics card) to exactly one VM.
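For reference, a sketch of what the relevant lines of a Xen xl guest configuration might look like; `gfx_passthru` and `pci` are the documented xl.cfg options, while the PCI address is a hypothetical placeholder for your own card:

```
# Excerpt from a hypothetical Xen xl guest configuration
gfx_passthru=1          # enable VGA/graphics passthrough for this guest
pci = [ '01:00.0' ]     # dedicate this PCI device (the graphics card) to the guest
```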
- VMware supports GPGPU computing via vSphere. Applications can access GPGPUs via CUDA or OpenCL in exactly the same way as when running natively; no changes are required to the application.
- VirtualBox PCIe passthrough
VirtualBox provides PCIe passthrough. Supposedly this will work with Linux host operating systems, but I have not found reports of people actually running such a system.
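For what it's worth, the VirtualBox manual exposes the feature through `VBoxManage`; the sketch below is untested, and the VM name and PCI addresses (host address@guest address) are hypothetical placeholders:

```shell
# Attach a host PCI device at 02:00.0 to the guest at virtual slot 01:05.0
# ("GuestVM" and both addresses are hypothetical placeholders)
VBoxManage modifyvm "GuestVM" --pciattach 02:00.0@01:05.0

# Detach it again
VBoxManage modifyvm "GuestVM" --pcidetach 02:00.0
```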
- Other Alternatives:
Instead of virtualization, one can also look into rCUDA,
or Virtual OpenCL (VOCL), an OpenCL platform that can transparently run unmodified OpenCL applications on a cluster with many devices, as if all the devices were local to each hosting node.
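As a sketch of how rCUDA presents remote GPUs to an unmodified application, the client side is configured through environment variables; the server name and device index below are hypothetical placeholders:

```shell
# rCUDA client configuration (server name and GPU index are hypothetical)
export RCUDA_DEVICE_COUNT=1                    # number of remote GPUs to expose locally
export RCUDA_DEVICE_0=gpuserver.example.com:0  # <server>:<GPU index> for virtual device 0

# The unmodified CUDA binary is then run against rCUDA's CUDA runtime
# library instead of NVIDIA's, e.g.:
# LD_LIBRARY_PATH=/opt/rCUDA/lib ./my_cuda_app
```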