LXC GPU Access
Giving an LXC guest GPU access allows you to use a GPU in a guest while it is still available to the host machine. This is a big advantage over virtual machines, where only the host or a single guest can have access to a GPU at one time. Even better, multiple LXC guests can share a GPU from the host at the same time.
Determine Device Major/Minor Numbers
To allow a container access to the device you'll have to know the device's major/minor numbers. These can be found easily enough by running ls -l in /dev/. As an example, to pass through the integrated UHD 630 GPU from a Core i7 8700k you would first list the devices which are created under /dev/dri.
root@blackbox:~# ls -l /dev/dri
total 0
drwxr-xr-x 2 root root 80 May 12 21:54 by-path
crw-rw---- 1 root video 226, 0 May 12 21:54 card0
crw-rw---- 1 root render 226, 128 May 12 21:54 renderD128
From that you can see the major device number is 226 and the minors are 0 and 128.
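As a cross-check, stat can print the major/minor numbers directly; note that it reports them in hexadecimal, so 226 shows up as e2 and 128 as 80:
root@blackbox:~# stat -c '%n %t:%T' /dev/dri/card0 /dev/dri/renderD128
/dev/dri/card0 e2:0
/dev/dri/renderD128 e2:80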
Provide LXC Access
In the configuration file you'd then add lines to allow the LXC guest access to that device and then bind mount the devices from the host into the guest. In the example above, since both devices share the same major number, it is possible to use the shorthand notation 226:* to represent all minor numbers with major number 226.
# /etc/pve/lxc/*.conf
+ lxc.cgroup.devices.allow: c 226:* rwm
+ lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file,mode=0666
+ lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
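These settings are only read at container start, so the guest has to be restarted for them to apply. Assuming a Proxmox container with ID 101 (adjust to your own CTID), a quick check that the device nodes show up might look like:
pct stop 101 && pct start 101
pct exec 101 -- ls -l /dev/dri
In an unprivileged container the group will still show up as nogroup at this point; the ID mapping in the next section fixes that.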
Allow Unprivileged Containers Access
In the example above we saw that card0 and renderD128 are both owned by root and have their groups set to video and render. Because the unprivileged part of LXC works by mapping the UIDs (user IDs) and GIDs (group IDs) in the LXC guest namespace to a high range of IDs on the host, it is necessary to create a custom mapping for that namespace that maps those groups in the LXC guest namespace to the host groups while leaving the rest unchanged, so you don't lose the added security of running an unprivileged container.
First you need to give root permission to map the group IDs. You can look in /etc/group to find the GIDs of those groups, but in this example video = 44 and render = 108 on our host system. You should add the following lines to allow root to map those groups to a new GID.
# /etc/subgid
+ root:44:1
+ root:108:1
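As a sanity check, getent shows the host-side GIDs the mapping will need to target (44 and 108 here are from this example; yours may differ):
root@blackbox:~# getent group video render
video:x:44:
render:x:108: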
Then you'll need to create the ID mappings. Since you're just dealing with groups, mapping the UIDs can be done in a single line, as shown on the first line below. It can be read as "remap 65,536 of the LXC guest namespace's UIDs, starting at 0, to a range on the host starting at 100,000". You can tell this relates to UIDs because of the u denoting users. It wasn't necessary to edit /etc/subuid because that file already gives root permission to perform this mapping.
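For reference, a stock Proxmox host typically already ships /etc/subuid with an entry along these lines, which is why no edit is needed there:
# /etc/subuid
root:100000:65536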
You have to do the same thing for groups, which is the same concept but slightly more verbose. In this example, looking at /etc/group in the LXC guest shows that video and render have GIDs of 44 and 106. Although you'll use g to denote GIDs, everything else is the same, except it is necessary to ensure the custom mappings cover the whole range of GIDs, so it requires more lines. The only tricky part is the second to last line, which maps the LXC guest namespace GID for render (106) to the host GID for render (108), because the groups have different GIDs.
# /etc/pve/lxc/*.conf
lxc.cgroup.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file,mode=0666
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
+ lxc.idmap: u 0 100000 65536
+ lxc.idmap: g 0 100000 44
+ lxc.idmap: g 44 44 1
+ lxc.idmap: g 45 100045 60
+ lxc.idmap: g 106 108 1
+ lxc.idmap: g 107 100107 65429
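After another restart, the mapping can be checked from inside the guest; the device nodes should now be group-owned by video and render rather than nogroup (the owner still shows as nobody because host root is not mapped into the guest). Again assuming container ID 101, output along these lines would be expected:
pct exec 101 -- ls -l /dev/dri
crw-rw---- 1 nobody video  226,   0 May 12 21:54 card0
crw-rw---- 1 nobody render 226, 128 May 12 21:54 renderD128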
Ensure IOMMU Is Activated
The first step of this process is to make sure that your hardware is even capable of this type of virtualization. You need a motherboard, CPU, and BIOS that have an IOMMU controller and support Intel VT-x and VT-d, or AMD-V and AMD-Vi. Some motherboards use different terminology for these; for example, they may list AMD-V as SVM and AMD-Vi as IOMMU controller.
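Before touching the bootloader it's worth confirming the CPU actually advertises the virtualization extensions; a non-zero count from the following grep means VT-x/AMD-V is exposed (VT-d/AMD-Vi itself still has to be enabled in the BIOS/UEFI):
grep -c -E '(vmx|svm)' /proc/cpuinfo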
Update Bootloader
Update Kernel Parameters
**NOTE** Be sure to replace intel_iommu=on with amd_iommu=on if you're running on AMD instead of Intel.
Grub2
# /etc/default/grub
- GRUB_CMDLINE_LINUX_DEFAULT="quiet"
+ GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
Systemd
# /etc/kernel/cmdline
- root=ZFS=rpool/ROOT/pve-1 boot=zfs
+ root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt
Rebuild Bootloader Options
Grub
update-grub
systemd-boot
bootctl update
Proxmox
pve-efiboot-tool refresh
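After rebooting, a quick way to confirm the kernel actually picked up the new parameters and enabled the IOMMU is to search the boot log; on Intel you'd expect to see lines such as "DMAR: IOMMU enabled":
dmesg | grep -i -e dmar -e iommu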
Enable Virtual Functions
Find the name of the link you want to add virtual functions to using ip link. In this scenario we're going to say we want to add 4 virtual functions to the link enp10s0f0. You can find the maximum number of virtual functions possible by reading sriov_totalvfs from sysfs...
cat /sys/class/net/enp10s0f0/device/sriov_totalvfs
7
To enable virtual functions you just echo the number you want to sriov_numvfs in sysfs...
echo 4 > /sys/class/net/enp10s0f0/device/sriov_numvfs
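If the echo succeeds, the new virtual functions should show up immediately as extra PCI devices and as vf entries on the parent link; either of these (using the same enp10s0f0 interface from above) is a reasonable spot check:
ip link show enp10s0f0
lspci | grep -i "virtual function"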
Make Persistent
Sysfs is a virtual file system in Linux kernel 2.5+ that provides a tree of system devices. This package provides the program 'systool' to query it: it can list devices by bus, class, and topology.
In addition this package ships a configuration file, /etc/sysfs.conf, which allows one to conveniently set sysfs attributes at system bootup (in the init script /etc/init.d/sysfsutils).
apt install sysfsutils
Configure sysfsutils
To make these changes persistent, you need to update /etc/sysfs.conf so that it gets set on startup.
echo "class/net/eth2/device/sriov_numvfs = 4" >> /etc/sysfs.conf