amdgpu: add kernelModule.patches option #321663

Open
wants to merge 3 commits into
base: master
Changes from 2 commits
42 changes: 42 additions & 0 deletions nixos/modules/services/hardware/amdgpu-kernel-module.nix
@@ -0,0 +1,42 @@
+# TODO there should probably be a generic mechanism to patch any in-kernel module like this
Contributor

I am not really keen on the idea. Which problems would it solve specifically for the amdgpu module?

If anything like this is to be introduced, I think it should be standardised and available for all modules. Perhaps @K900 wants to take a look here as well.

Member Author

So far, I have had two issues that required patching the amdgpu module:

  1. SteamVR needs to be able to create a high-priority queue, but you can't give it CAP_SYS_NICE because we run Steam inside of a userns. I must patch the amdgpu kernel module to allow any process to create such a queue (and potentially DoS the system, but I don't care).
  2. The Framework 16 Laptop has a quirk: the display brightness does not go as low as it should, so the lowest setting is too bright for dim environments. Until FW releases a firmware fix for this (which would be the proper fix), the module must be patched to add the quirk.

I'd also like this to be a standard option but I think that can come later. No need to perfect it in the V1.

Contributor

I really, really don't want to be setting this precedent.

Member Author

While I could certainly see this being a footgun, I don't really see any great danger here; could you elaborate?

I was also planning on making the unprivileged high-priority queue patch into a stand-alone option and providing some hacks in the Steam module to make the SteamVR setup experience smoother.

https://github.com/Atemu/nixos-config/blob/93b9546d1698286367bdb8826a3d827d5527128c/modules/gaming/module.nix#L116-L126

Contributor

I think a generic mechanism would be nice. Is there any way to unstall this?

Member

+1 on the generic mechanism.

I think a challenge here could be to make it easy to configure for users (i.e. boot.kernelModules."amdgpu".patches = [...]), as we would somehow need to map module names to module paths within the kernel source tree.
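
One possible shape for such a generic interface, as a rough sketch only: instead of mapping module names to source paths automatically, the user states the path explicitly. Everything named here is hypothetical and not part of this PR or of NixOS: the `boot.kernelModulePatches` option, and the `./patched-kernel-module.nix` file, which is assumed to be a generalised copy of `amdgpu-kernel-module.nix` that additionally accepts `name` and `modulePath` arguments.

```nix
# Hypothetical sketch: `boot.kernelModulePatches` is NOT an existing NixOS option.
{ config, lib, pkgs, ... }:

let
  cfg = config.boot.kernelModulePatches;
in
{
  options.boot.kernelModulePatches = lib.mkOption {
    default = { };
    description = "Patches for individual in-tree kernel modules, keyed by a free-form name.";
    type = lib.types.attrsOf (
      lib.types.submodule {
        options = {
          # Stated by the user, which sidesteps the name-to-path mapping problem.
          modulePath = lib.mkOption {
            type = lib.types.str;
            example = "drivers/gpu/drm/amd/amdgpu";
            description = "Directory inside the kernel source tree to build with `M=`.";
          };
          patches = lib.mkOption {
            type = lib.types.listOf lib.types.path;
            default = [ ];
            description = "Patches applied to the kernel tree before building this module.";
          };
        };
      }
    );
  };

  # Build one extra module package per entry that actually has patches.
  config.boot.extraModulePackages = lib.mapAttrsToList (
    name: moduleCfg:
    # Hypothetical file: a generalised amdgpu-kernel-module.nix.
    pkgs.callPackage ./patched-kernel-module.nix {
      inherit (config.boot.kernelPackages) kernel;
      inherit (moduleCfg) modulePath patches;
      inherit name;
    }
  ) (lib.filterAttrs (_: moduleCfg: moduleCfg.patches != [ ]) cfg);
}
```

A configuration would then set something like `boot.kernelModulePatches.amdgpu = { modulePath = "drivers/gpu/drm/amd/amdgpu"; patches = [ ... ]; };`. The sketch does not address units that span several module directories.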

Member Author

Building a good interface here would be hard, as some modules that are conceptually one unit may consist of multiple individual module files.

That's at least part of the reason why I limited my work to AMDGPU first. I'd prefer if we just merged this as iteration 1 and came up with a good design for a generic mechanism in iteration 2.

I know it's hard but we need to put a limit on our perfectionism sometimes ;)

@K900 could you share your doubts about "setting this precedent"?

Contributor

Mostly my concern is that we're giving users a tool that's very sharp and very easy to hold wrong, and the failure mode is not great.

Contributor

Also, the initial goal of this, that being the capability issues, should really be fixed somewhere at a higher level (possibly in bwrap itself even?)

Member

> Also, the initial goal of this, that being the capability issues, should really be fixed somewhere at a higher level (possibly in bwrap itself even?)

See #217119

In short:
Applications inside of user namespaces can never have effective file capabilities. This is just a design decision of the kernel. There have been numerous attempts to move away from capabilities specifically for high priority DRM contexts (which is what SteamVR needs) but none of them have made it into the kernel yet.

+{
+  stdenv,
+  kernel, # The kernel to patch
+  patches ? [ ],
+}:
+
+stdenv.mkDerivation {
+  pname = "amdgpu-kernel-module-customised";
+  inherit (kernel)
+    src
+    version
+    postPatch
+    nativeBuildInputs
+    modDirVersion
+    ;
+  patches = kernel.patches or [ ] ++ patches;
+
+  modulePath = "drivers/gpu/drm/amd/amdgpu";
+
+  buildPhase = ''
+    BUILT_KERNEL=${kernel.dev}/lib/modules/$modDirVersion/build
+
+    cp $BUILT_KERNEL/Module.symvers $BUILT_KERNEL/.config ${kernel.dev}/vmlinux ./
+
+    make "-j$NIX_BUILD_CORES" modules_prepare
+    make "-j$NIX_BUILD_CORES" M=$modulePath modules
+  '';
+
+  installPhase = ''
+    make \
+      INSTALL_MOD_PATH="$out" \
+      XZ="xz -T$NIX_BUILD_CORES" \
+      M="$modulePath" \
+      modules_install
+  '';
+
+  meta = {
+    description = "AMDGPU kernel module";
+    inherit (kernel.meta) license;
+  };
+}
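
To try the derivation above in isolation, outside the NixOS module, it could be called directly. This is a minimal sketch: the file name `test.nix`, the choice of `pkgs.linuxPackages.kernel`, and `./example.patch` are illustrative assumptions, not part of this PR.

```nix
# Hypothetical test.nix, placed next to amdgpu-kernel-module.nix.
let
  pkgs = import <nixpkgs> { };
in
pkgs.callPackage ./amdgpu-kernel-module.nix {
  # Any kernel package works here; it must be the kernel the module will be loaded into.
  kernel = pkgs.linuxPackages.kernel;
  # Placeholder patch file; it should only touch drivers/gpu/drm/amd/amdgpu.
  patches = [ ./example.patch ];
}
```

Running `nix-build test.nix` should then build only the `amdgpu` module against the prebuilt kernel's `Module.symvers` and `.config`.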
57 changes: 52 additions & 5 deletions nixos/modules/services/hardware/amdgpu.nix
@@ -10,14 +10,54 @@ in {
       series cards. Note: this removes support for analog video outputs,
       which is only available in the `radeon` driver
     '';
-    initrd.enable = lib.mkEnableOption ''
-      loading `amdgpu` kernelModule in stage 1.
-      Can fix lower resolution in boot screen during initramfs phase
-    '';
+    kernelModule = {
+      inInitrd = lib.mkEnableOption ''
+        installing the `amdgpu` kernelModule into the initrd; making it
+        available in stage 1 of the boot process.
+
+        This allows for an earlier modeset to apply the preferred resolution in
+        the beginning of the initramfs phase rather than after it.
+      '';
+
+      patches = lib.mkOption {
+        type = with lib.types; listOf path;
+        default = [ ];
+        description = ''
+          Patches to apply to the kernel for the `amdgpu` kernel module build.
+
+          This is intended for applying small patches concerning only the
+          `amdgpu` module's internals without needing to rebuild the entire
+          kernel.
+
+          The patches are applied to the entire kernel tree but only the
+          `amdgpu` module will actually be built and used. You should therefore
+          not touch anything outside of `drivers/gpu/drm/amd/amdgpu` using the
+          patches as those modifications will not be present in the actual
+          kernel you will be running which might cause undefined and likely
+          erroneous behaviour.
+          Use {option}`boot.kernelPatches` instead for such cases.
+
+          A reboot is required for the patched module to be loaded.
+        '';
+        example = lib.literalExpression ''
+          [
+            (pkgs.fetchpatch2 {
+              url = "https://lore.kernel.org/lkml/20240610-amdgpu-min-backlight-quirk-v1-1-8459895a5b2a@weissschuh.net/raw";
+              hash = "";
+            })
+          ]
+        '';
+      };
+    };
     opencl.enable = lib.mkEnableOption ''OpenCL support using ROCM runtime library'';
     # cfg.amdvlk option is defined in ./amdvlk.nix module
   };

+  imports = [
+    # This can be removed post 24.11; it was only ever in unstable
+    (lib.mkRenamedOptionModule [ "hardware" "amdgpu" "initrd" "enable" ] [ "hardware" "amdgpu" "kernelModule" "inInitrd" ])
+  ];
+
   config = {
     boot.kernelParams = lib.optionals cfg.legacySupport.enable [
       "amdgpu.si_support=1"
@@ -26,7 +66,14 @@ in {
       "radeon.cik_support=0"
     ];

-    boot.initrd.kernelModules = lib.optionals cfg.initrd.enable [ "amdgpu" ];
+    boot.initrd.kernelModules = lib.optionals cfg.kernelModule.inInitrd [ "amdgpu" ];
+
+    boot.extraModulePackages = lib.mkIf (cfg.kernelModule.patches != [ ]) [
+      (pkgs.callPackage ./amdgpu-kernel-module.nix {
+        inherit (config.boot.kernelPackages) kernel;
+        inherit (cfg.kernelModule) patches;
+      })
+    ];

     hardware.opengl = lib.mkIf cfg.opencl.enable {
       enable = lib.mkDefault true;
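
For orientation, a user configuration using the options proposed in this diff might look roughly like the following. It is a sketch that assumes the PR is applied as shown; the local patch file name is a made-up placeholder, and the `hash` is left empty as in the option's own example and would need to be filled in.

```nix
{ pkgs, ... }:
{
  hardware.amdgpu.kernelModule = {
    # Renamed from hardware.amdgpu.initrd.enable by this PR.
    inInitrd = true;

    patches = [
      # Hypothetical local patch; must only touch drivers/gpu/drm/amd/amdgpu.
      ./amdgpu-unprivileged-high-priority-queue.patch

      # Backlight quirk patch, taken from the option's example in the diff.
      (pkgs.fetchpatch2 {
        url = "https://lore.kernel.org/lkml/20240610-amdgpu-min-backlight-quirk-v1-1-8459895a5b2a@weissschuh.net/raw";
        hash = ""; # fill in the real hash before building
      })
    ];
  };
}
```

As the option description states, a reboot is required before the patched module is loaded.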