LXC GPU Access

Giving an LXC guest GPU access allows you to use a GPU in a guest while it remains available to the host machine. This is a big advantage over virtual machines, where only a single host or guest can have access to a GPU at one time. Even better, multiple LXC guests can share a GPU from the host at the same time.

Determine Device Major/Minor Numbers

To allow a container access to the device you'll have to know the device's major/minor numbers. These can be found easily enough by running ls -l in /dev/. As an example, to pass through the integrated UHD 630 GPU from a Core i7 8700K, you would first list the devices that are created under /dev/dri.

root@blackbox:~# ls -l /dev/dri
total 0
drwxr-xr-x 2 root root         80 May 12 21:54 by-path
crw-rw---- 1 root video  226,   0 May 12 21:54 card0
crw-rw---- 1 root render 226, 128 May 12 21:54 renderD128

From that you can see the major device number is 226 and the minors are 0 and 128.
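
Alternatively, stat can print the major/minor numbers of a single device node directly. It reports them in hex, so the e2 and 80 below are 226 and 128 in decimal:

root@blackbox:~# stat -c '%t %T' /dev/dri/renderD128
e2 80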

Provide LXC Access

In the configuration file you'd then add lines to allow the LXC guest access to that device, and then also bind mount the devices from the host into the guest. In the example above, since both devices share the same major number, it is possible to use the shorthand notation 226:* to represent all minor numbers with major number 226.

# /etc/pve/lxc/*.conf
+ lxc.cgroup.devices.allow: c 226:* rwm
+ lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file,mode=0666
+ lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
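
These lines are only read when the container starts, so restart the guest for them to take effect. On Proxmox a quick check from the host looks like this (a sketch; 101 is a hypothetical container ID, replace it with yours):

pct restart 101
pct exec 101 -- ls -l /dev/dri

If the passthrough worked, both device nodes show up inside the guest.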

Allow Unprivileged Containers Access

In the example above we saw that card0 and renderD128 are both owned by root and have their groups set to video and render. Because the unprivileged part of LXC works by mapping the UIDs (user IDs) and GIDs (group IDs) in the LXC guest namespace to a high range of IDs on the host, it is necessary to create a custom mapping for that namespace that maps those groups in the LXC guest namespace to the host groups while leaving the rest unchanged, so you don't lose the added security of running an unprivileged container.

First you need to give root permission to map the group IDs. You can look in /etc/group to find the GIDs of those groups; in this example video = 44 and render = 108 on the host system.
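
A quick way to look those up without reading /etc/group by hand is getent (a sketch; the GIDs shown match this example host and may differ on yours):

root@blackbox:~# getent group video render
video:x:44:
render:x:108:

You should add the following lines, which allow root to map those groups to a new GID.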

# /etc/subgid
+ root:44:1
+ root:108:1

Then you'll need to create the ID mappings. Since you're only remapping groups, mapping the UIDs can be done in a single line, as shown on the first line below. It can be read as "remap 65,536 of the LXC guest namespace's UIDs, starting at 0, to a range on the host starting at 100,000" (i.e. guest UIDs 0 through 65,535 become host UIDs 100,000 through 165,535). You can tell this line relates to UIDs because of the u denoting users. It wasn't necessary to edit /etc/subuid because that file already gives root permission to perform this mapping.

You have to do the same thing for groups, which is the same concept but slightly more verbose. In this example, looking at /etc/group in the LXC guest shows that video and render have GIDs of 44 and 106. Although you'll use g to denote GIDs, everything else is the same, except that it is necessary to ensure the custom mappings cover the whole range of GIDs, so it requires more lines. The only tricky part is the second to last line, which maps the LXC guest namespace GID for render (106) to the host GID for render (108), because the groups have different GIDs.

# /etc/pve/lxc/*.conf
  lxc.cgroup.devices.allow: c 226:* rwm
  lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file,mode=0666
  lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
+ lxc.idmap: u 0 100000 65536
+ lxc.idmap: g 0 100000 44
+ lxc.idmap: g 44 44 1
+ lxc.idmap: g 45 100045 60
+ lxc.idmap: g 106 108 1
+ lxc.idmap: g 107 100107 65429
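
After restarting the container, a quick sanity check from inside the guest should now show the devices owned by the guest's video and render groups rather than nobody/nogroup (illustrative output; your prompt and dates will differ):

root@guest:~# ls -l /dev/dri
crw-rw---- 1 root video  226,   0 May 12 21:54 card0
crw-rw---- 1 root render 226, 128 May 12 21:54 renderD128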

Ensure IOMMU Is Activated

The first step of this process is to make sure that your hardware is even capable of this type of virtualization. You need a motherboard, CPU, and BIOS that have an IOMMU controller and support Intel VT-x and Intel VT-d, or AMD-V and AMD-Vi. Some motherboards use different terminology for these; for example, they may list AMD-V as SVM and AMD-Vi as IOMMU controller.
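
A quick first check is to look for the CPU virtualization flags; a nonzero count means the CPU advertises hardware virtualization (vmx on Intel, svm on AMD), though the IOMMU itself still depends on your board and BIOS settings:

egrep -c '(vmx|svm)' /proc/cpuinfo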

Update Bootloader

Update Kernel Parameters

NOTE: Be sure to replace intel_iommu=on with amd_iommu=on if you're running on AMD instead of Intel.

Grub2
# /etc/default/grub
- GRUB_CMDLINE_LINUX_DEFAULT="quiet"
+ GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
systemd-boot
# /etc/kernel/cmdline
- root=ZFS=rpool/ROOT/pve-1 boot=zfs
+ root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt

Rebuild Bootloader Options

Grub
update-grub
systemd-boot
bootctl update
Proxmox
pve-efiboot-tool refresh
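
After rebooting, you can verify the change took effect by checking the kernel log; on Intel systems a line like "DMAR: IOMMU enabled" indicates success:

dmesg | grep -e DMAR -e IOMMU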

Enable Virtual Functions

First check how many virtual functions the device supports by reading sriov_totalvfs (enp10s0f0 here is the example NIC):

cat /sys/class/net/enp10s0f0/device/sriov_totalvfs
7

To enable virtual functions you just echo the number you want to sriov_numvfs in sysfs...

echo 4 > /sys/class/net/enp10s0f0/device/sriov_numvfs
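
Each virtual function appears as its own PCI device, so you can confirm they were created with lspci (the device names will depend on your NIC):

lspci | grep -i 'virtual function'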

Make Persistent

From the sysfsutils package description:

Sysfs is a virtual file system in Linux kernel 2.5+ that provides a tree of system devices. This package provides the program 'systool' to query it: it can list devices by bus, class, and topology.

In addition this package ships a configuration file /etc/sysfs.conf which allows one to conveniently set sysfs attributes at system bootup (in the init script /etc/init.d/sysfsutils).

apt install sysfsutils

Configure sysfsutils

To make these changes persistent, you need to update /etc/sysfs.conf so that it gets set on startup.

echo "class/net/eth2/device/sriov_numvfs = 4" >> /etc/sysfs.conf