You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
我已经搜索过FAQ | I have searched FAQ
当前行为 | Current Behavior
ERROR 09-13 03:41:19 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 138 died, exit code: -15
INFO 09-13 03:41:19 multiproc_worker_utils.py:123] Killing local vLLM worker processes
Process SpawnProcess-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 328, in forward
x = x + self.mlp(self.norm2(x)) RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
ERROR 09-13 03:41:19 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 138 died, exit code: -15
INFO 09-13 03:41:19 multiproc_worker_utils.py:123] Killing local vLLM worker processes
Process SpawnProcess-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 328, in forward
x = x + self.mlp(self.norm2(x))
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with
TORCH_USE_CUDA_DSA
to enable device-side assertions.期望行为 | Expected Behavior
No response
复现方法 | Steps To Reproduce
docker run --gpus all -it --shm-size=64g --privileged --name qwen2vllm-2 --network="host" -v $(pwd):/app qwenllm/qwenvl:latest
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-VL-2B-Instruct --model /app/qwen/Qwen2-VL-2B-Instruct --max-model-len 16384 --dtype half --tensor-parallel-size 4
运行环境 | Environment
备注 | Anything else?
No response
The text was updated successfully, but these errors were encountered: