feat: build nvidia open source kernel module #220

p5 · 2024-07-22T08:49:41Z

Enable the open kernel module builds.
This should not affect anything downstream; it just gets us ready for the switch in the next Nvidia driver release.

Please note: I have not yet tried booting into an image with this driver.

p5 · 2024-07-22T09:13:20Z

Containerfile.nvidia-open

I copied this Containerfile because I didn't want this PR to be rewriting the workflow logic too. If we wanted to use the same Containerfile, we would need to rework the GHA jobs to supply additional build args only for Nvidia.

dylanmtaylor · 2024-07-22T23:26:48Z

"For cutting-edge platforms such as NVIDIA Grace Hopper or NVIDIA Blackwell, you must use the open-source GPU kernel modules. The proprietary drivers are unsupported on these platforms.

For newer GPUs from the Turing, Ampere, Ada Lovelace, or Hopper architectures, NVIDIA recommends switching to the open-source GPU kernel modules."
https://developer.nvidia.com/blog/nvidia-transitions-fully-towards-open-source-gpu-kernel-modules/

I think we should invert this -- default to nvidia-open with nvidia-closed being the edge case for older GPUs as that is what Nvidia is pushing for.

This wouldn't affect this PR per se, as the default choice would be set in the downstream image builds.

m2Giles · 2024-07-23T02:54:35Z

build-kmod-nvidia.sh


 akmods --force --kernels "${KERNEL_VERSION}" --kmod "nvidia"

 modinfo /usr/lib/modules/${KERNEL_VERSION}/extra/nvidia/nvidia{,-drm,-modeset,-peermem,-uvm}.ko.xz > /dev/null || \
 (cat /var/cache/akmods/nvidia/${NVIDIA_AKMOD_VERSION}-for-${KERNEL_VERSION}.failed.log && exit 1)

+# View license information
+modinfo -l /usr/lib/modules/${KERNEL_VERSION}/extra/nvidia/nvidia{,-drm,-modeset,-peermem,-uvm}.ko.xz


While this lists the license out, can we add a conditional check to make sure that the correctly licensed kmod was built for the given input?

It looks like we intended to do a check here, but we didn't?
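
A minimal sketch of what such a check could look like, not the PR's actual implementation. The helper names `expected_license` and `licenses_match` are hypothetical, and the license strings ("Dual MIT/GPL" for the open modules, "NVIDIA" for the proprietary ones) and the `kernel` / `kernel-open` type names are assumptions that should be confirmed against real `modinfo -l` output before relying on this.

```shell
# Hypothetical helper: map the requested module type to the license string
# we expect modinfo to report. Both the type names and the license strings
# are assumptions; verify them against a real build.
expected_license() {
    case "$1" in
        kernel-open) echo "Dual MIT/GPL" ;;
        kernel)      echo "NVIDIA" ;;
        *)           return 1 ;;
    esac
}

# Hypothetical helper: read license lines on stdin and succeed only if there
# is at least one line and every line equals the expected license.
licenses_match() {
    lm_want="$1"
    lm_seen=0
    while IFS= read -r lm_line; do
        lm_seen=1
        [ "$lm_line" = "$lm_want" ] || return 1
    done
    [ "$lm_seen" -eq 1 ]
}

# Possible use in the build script (KERNEL_MODULE_TYPE is assumed to hold
# the requested type):
#   want="$(expected_license "${KERNEL_MODULE_TYPE}")" || exit 1
#   modinfo -l /usr/lib/modules/"${KERNEL_VERSION}"/extra/nvidia/nvidia*.ko.xz \
#       | licenses_match "${want}" || exit 1
```

Keeping the comparison separate from the `modinfo` call makes it possible to exercise the check without a built kmod.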

m2Giles · 2024-07-23T02:55:56Z

Containerfile.nvidia-open

@@ -0,0 +1,63 @@
+###
+### Containerfile.nvidia - used to build ONLY NVIDIA kmods


I think we can just symlink this instead if it's going to be an exact copy of the Nvidia Containerfile.

Right now it's separate, but it could be an `if` in the Containerfile.
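
For illustration only, the `if` could be a small shell step run from a single Containerfile. This is a sketch under assumptions: `module_type_for` is a hypothetical helper, the flavor value `nvidia-open` is assumed to arrive as a build arg, and the `kernel` / `kernel-open` values are assumed to be what the build script accepts.

```shell
# Hypothetical helper: pick the kernel module type from a flavor name that
# would be supplied as a build arg. Anything other than "nvidia-open" falls
# back to the closed module type.
module_type_for() {
    if [ "$1" = "nvidia-open" ]; then
        echo "kernel-open"
    else
        echo "kernel"
    fi
}

# Possible use inside a RUN step (FLAVOR is an assumed build arg and the
# script path is illustrative):
#   /tmp/build-kmod-nvidia.sh "$(module_type_for "${FLAVOR}")"
```

With this shape, the workflow would only need to pass one extra build arg for the Nvidia jobs instead of maintaining a second Containerfile.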

p5 · 2024-08-07T22:56:05Z

LGTM! (Can't approve my own PR)

p5 added 3 commits July 22, 2024 09:49

feat: build nvidia open source kernel module

dbe504a

chore: output license information

b24a83b

fix: actually switch the kernel module type

9971e70

p5 marked this pull request as ready for review July 22, 2024 09:12

p5 requested a review from castrojo as a code owner July 22, 2024 09:12

p5 commented Jul 22, 2024

p5 mentioned this pull request Jul 22, 2024

feat(nvidia): Attempt to support NVIDIA open #82

Closed

castrojo previously approved these changes Jul 22, 2024

p5 enabled auto-merge July 22, 2024 22:10

m2Giles reviewed Jul 23, 2024

chore: Update sed command to properly switch no matter the upstream default

b0088f2

KyleGospo dismissed castrojo’s stale review via b0088f2 August 7, 2024 22:46

KyleGospo previously approved these changes Aug 7, 2024

castrojo previously approved these changes Aug 7, 2024

chore: Pass param through directly

3dbeaf2

KyleGospo dismissed stale reviews from castrojo and themself via 3dbeaf2 August 7, 2024 22:52

KyleGospo disabled auto-merge August 7, 2024 22:53

p5 commented Aug 7, 2024

chore: Pass kernel-open for nvidia-open builds

ac6b7b6

KyleGospo enabled auto-merge August 7, 2024 22:55

KyleGospo approved these changes Aug 7, 2024

castrojo approved these changes Aug 7, 2024

KyleGospo added this pull request to the merge queue Aug 7, 2024

Merged via the queue into main with commit ab12663 Aug 7, 2024
42 checks passed

KyleGospo deleted the enable-nvidia-open-gpu-builds branch August 7, 2024 23:51

github-actions bot mentioned this pull request Aug 7, 2024

chore(main): release 1.1.0 #200

Open
