CUDA error: no kernel image is available for execution on the device #147
Comments
I ran the installer again and did not get the same deprecation warnings; the output was:
In particular the line
Ah wait. The compilation job and the training job landed on different machines in the cluster with different GPU models. Probably the kernel has to be compiled for the actual GPU it will be used with...
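One way to confirm this kind of mismatch is to compare the compute capability of the GPU on the training node with the architectures the extension was built for. The following is a minimal sketch, not something taken from this thread: `is_covered` is a hypothetical helper, and it only models the binary-compatibility rule for compiled kernels (same major version, device minor version no lower than the build's). It ignores any embedded PTX that the driver could JIT-compile.

```python
def is_covered(device_cap, built_caps):
    """Return True if a kernel binary built for one of `built_caps`
    can run on a GPU with compute capability `device_cap`.

    Both arguments use (major, minor) tuples, e.g. (8, 0).
    Rule modelled: same major version, build minor <= device minor.
    """
    dev_major, dev_minor = device_cap
    return any(
        major == dev_major and minor <= dev_minor
        for major, minor in built_caps
    )


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        print(torch.cuda.get_device_name(0), "compute capability", cap)
        # Hypothetical build target: replace (7, 0) with the capability of
        # the GPU on the node that compiled the extension.
        print("covered by a 7.0 build:", is_covered(cap, [(7, 0)]))
```

Running this on both the build node and the training node shows whether the two jobs saw different GPU generations.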
Yes, the kernel has to be compiled for the specific GPU it runs on. This can cause problems on clusters where one environment is shared across nodes with different GPU models. I recommend creating a separate environment for each machine type.
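An alternative to per-machine environments, offered here as an assumption rather than something tested in this thread, is to build the extension once for every GPU type in the cluster. PyTorch's extension builder reads the `TORCH_CUDA_ARCH_LIST` environment variable when compiling CUDA extensions. In the sketch below, `arch_list_for` is a hypothetical helper and the capabilities 7.0 and 8.0 are placeholders for whatever GPUs the cluster actually has.

```python
def arch_list_for(capabilities):
    """Turn (major, minor) compute capabilities into a
    TORCH_CUDA_ARCH_LIST value such as "7.0;8.0"."""
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


# Placeholder capabilities: substitute the values reported by
# torch.cuda.get_device_capability() on each node type.
cluster_caps = [(8, 0), (7, 0), (8, 0)]

# Print a shell line to run before rebuilding the extension.
print(f'export TORCH_CUDA_ARCH_LIST="{arch_list_for(cluster_caps)}"')
```

Export the printed value before rerunning the extension's install step, so the rebuilt binary includes kernels for each listed architecture. Build time and binary size grow with the number of architectures.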
Hi there,
I have copied s4.py and the kernel extension into another repository I am working on. I had the S4 components running (with CUDA), and then I installed the kernel extensions. The build output was so full of deprecation warnings that it filled my terminal history, but it ends with
Also, I remember some warnings at the beginning because the CUDA version is 12.3 but PyTorch is built for 12.1...
In any case, when I now try to train with the S4 components, I get an error like:
Any idea how to work around this?
Thanks...