nvproxy: add ioctl NV_CONF_COMPUTE_CTRL_CMD_GPU_GET_KEY_ROTATION_STATE
Hey, this adds a missing ioctl required to run workloads on H100s with confidential computing (CC) mode on. I couldn't find the corresponding ioctl in any supported driver version prior to 550.90.07, so I added it only to that version's ABI.

Without this patch the following example crashes:

```bash
$ docker run --runtime=runsc --gpus=all pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime python -c "import torch; torch.cuda.init()"
```

The error is:

```
Traceback (most recent call last):
  File "/test.py", line 3, in <module>
    torch.cuda.init()
  File "/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py", line 260, in init
    _lazy_init()
  File "/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py", line 293, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
```

At the same time, gVisor's debug logs show `nvproxy: unknown control command 0xcb33010c`.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#10824 from derpsteb:ob/key-rotation 960c2d0
PiperOrigin-RevId: 668003601
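To illustrate why the command is rejected on one driver ABI and accepted on another, here is a minimal, self-contained sketch of a per-version control-command table. It is not the actual nvproxy code: the names `handler`, `abis`, `passthrough`, and `dispatch` are hypothetical, and the older version `535.104.05` is included only as an example of an ABI that lacks the entry. The command value `0xcb33010c` is taken from the debug log above.

```go
package main

import "fmt"

// Control command value as reported in the gVisor debug log.
const ctrlCmdGetKeyRotationState uint32 = 0xcb33010c

// handler is a hypothetical stand-in for a control-command handler.
type handler func(cmd uint32) string

// passthrough pretends to forward the command to the host driver unchanged.
func passthrough(cmd uint32) string {
	return fmt.Sprintf("forwarded %#x", cmd)
}

// abis maps a driver version to the control commands it allows.
// Only 550.90.07 registers the key rotation command, mirroring the
// commit message; the older version is illustrative only.
var abis = map[string]map[uint32]handler{
	"535.104.05": {},
	"550.90.07": {
		ctrlCmdGetKeyRotationState: passthrough,
	},
}

// dispatch looks the command up in the ABI table for the given driver
// version and rejects anything that is not registered there.
func dispatch(version string, cmd uint32) string {
	if h, ok := abis[version][cmd]; ok {
		return h(cmd)
	}
	return fmt.Sprintf("nvproxy: unknown control command %#x", cmd)
}

func main() {
	for _, v := range []string{"535.104.05", "550.90.07"} {
		fmt.Printf("%s: %s\n", v, dispatch(v, ctrlCmdGetKeyRotationState))
	}
}
```

In this sketch an unregistered command falls through to the same "unknown control command" message seen in the logs, so the fix amounts to adding one entry to the table for the version that defines it.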