AssertionError: data parallel group is already initialized #549
Comments
Solved! It turned out I had initialized two LLM models for different services.
It's not working that way in Google Colab. Which platform did you use for coding?
I am having the same issue. I'm working in a virtual environment (venv) under WSL2. How can I resolve this?
@mzeidhassan
@YuamLu, I'm also getting the same error while using vLLM through LangChain. I'm running on an Azure Ubuntu virtual machine and importing it with `from langchain.llms import VLLM`. The error is `ValidationError: 1 validation error for VLLM`. Please let me know in more detail how you solved it.
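(For context, a typical LangChain `VLLM` initialization looks roughly like the sketch below; the model name and parameter values are illustrative and not taken from the original comment. A `ValidationError` from this class usually means one of the constructor fields failed pydantic validation.)

```python
from langchain.llms import VLLM

# Illustrative parameters only; adjust the model name and values for your setup.
llm = VLLM(
    model="facebook/opt-125m",
    trust_remote_code=True,
    max_new_tokens=128,
    temperature=0.8,
)

print(llm("What is the capital of France?"))
```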
I've opened pull request #817; you can try my code. If you still get an error, paste the full error message here and I'll do my best to solve it.
@YuamLu
I found a fix in my own use case, which does not involve changing the source code:
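(The snippet itself did not survive extraction; a minimal sketch of what it likely was, based on the follow-up comment below, with `destroy_model_parallel` called before creating another model in the same process. The model names are placeholders.)

```python
from vllm import LLM
from vllm.model_executor.parallel_utils.parallel_state import destroy_model_parallel

# First model (placeholder name)
llm = LLM(model="first-model-name")
# ... run inference ...

# Reset vLLM's global parallel state before creating another model in the same process
destroy_model_parallel()
llm = LLM(model="second-model-name")
```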
Hope that helps others 🙂
Thanks for the quick solution, really helpful. I additionally got a CUDA out-of-memory (OOM) error (classic), so I thought I'd add an extended solution. Hope it helps too.

```python
# Add the same code shown by @saattrupdan
import torch
from vllm import LLM
from vllm.model_executor.parallel_utils.parallel_state import destroy_model_parallel

# Initialise a vLLM model for the first time
model = LLM(model="test-model-name", trust_remote_code=True)

# This vLLM function resets the global parallel state, which allows a new model to be initialised
destroy_model_parallel()

# If you face a CUDA OOM error, delete the old model and wait for queued CUDA operations to finish
del model
torch.cuda.synchronize()

# Now, re-initialise a new vLLM model
model = LLM(model="test-model-name", trust_remote_code=True)
```
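(An additional note, not from the original thread: if GPU memory still isn't released after `del model`, it is common to also run Python's garbage collector and empty the CUDA cache; a minimal sketch:)

```python
import gc
import torch

# Force Python to drop the deleted model's references, then free cached CUDA memory
gc.collect()
torch.cuda.empty_cache()
```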
Hi, when doing inference on a single GPU, I encountered this assertion error.
It happens when running vllm/model_executor/parallel_utils/parallel_state.py. I do not know why vLLM needs to call init_distributed_environment when I am only using a single GPU.
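(A minimal reproduction of the situation described in the "Solved!" comment above, using a small public model name purely for illustration:)

```python
from vllm import LLM

# The first LLM sets up vLLM's global distributed/parallel state.
llm_a = LLM(model="facebook/opt-125m")

# Creating a second LLM in the same process hits the assertion in
# vllm/model_executor/parallel_utils/parallel_state.py:
# AssertionError: data parallel group is already initialized
llm_b = LLM(model="facebook/opt-125m")
```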