Error with vLLM docker container vllm/vllm-openai:v0.3.0
#2773
I'm also having this issue with the Docker deployment of vLLM. I pulled the v0.3.0 image from Docker Hub and created the container with the following options.
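A minimal sketch of the invocation, with a placeholder model name and port rather than the exact flags:

```shell
# Sketch of the container invocation (model and port are placeholders).
docker run --gpus all \
  --ipc=host \
  -p 8000:8000 \
  vllm/vllm-openai:v0.3.0 \
  --model mistralai/Mistral-7B-Instruct-v0.2
```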
It worked just fine with vLLM v0.2.7, producing the following docker logs.
However, in the same environment, v0.3.0 raises a CUDA error with the following docker logs:
My server has an 'RTX 6000 Ada Generation D6 48GB' (compute capability 8.9, supported by CUDA 11.8 and 12.0-12.4), so I don't think the GPU itself is the problem. I suspect it's a compatibility issue with Ray, since the Ray version requirement was updated on Jan 29, 2024 in commit 7d64841, following issue #2636. I'd really appreciate it if you could take a look at this along with @sarahwooders's issue.
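One quick way to verify which Ray version each image actually ships (a minimal sketch, assuming python3 is on the image's PATH):

```shell
# Print the bundled Ray version for each image tag to compare the pins.
docker run --rm --entrypoint python3 vllm/vllm-openai:v0.2.7 -c "import ray; print(ray.__version__)"
docker run --rm --entrypoint python3 vllm/vllm-openai:v0.3.0 -c "import ray; print(ray.__version__)"
```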
Solved by #2845
I am trying to deploy vLLM on Kubernetes (k8s) with the following deployment YAML.
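A minimal sketch of the manifest's shape, with placeholder names, model, and resource limits rather than the exact spec:

```yaml
# Sketch of the deployment shape (names, model, and limits are placeholders).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:v0.3.0
          args: ["--model", "mistralai/Mistral-7B-Instruct-v0.2"]
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1
```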
Although this deployment worked fine with previous versions of the vLLM docker container, on the latest version the following error occurs after the model finishes loading: