[Bug]: TimeoutError During Benchmark Profiling with Torch Profiler on vLLM v0.6.0 #8326
Comments
you can set …
@robertgshaw2-neuralmagic Hi, adding `--disable-frontend-multiprocessing` did not work for me. I think the info above is incomplete; below is my full log. The real reason is …
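For reference, a minimal sketch of the workaround being discussed: launching the OpenAI-compatible server with frontend multiprocessing disabled, which is intended to route around the unhandled multiprocessing path mentioned below. The model name is a placeholder, not taken from this report.

```bash
# Sketch of the workaround discussed in this thread: disable frontend
# multiprocessing so the engine runs in the same process that saw
# VLLM_TORCH_PROFILER_DIR. Model name is a placeholder.
VLLM_TORCH_PROFILER_DIR=/app/vllm_profile \
    python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --disable-frontend-multiprocessing
```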
I forgot to handle the multiproc case. Will make a PR. For now, set …
For the timeout issue, try setting the env var: …
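The variable name did not survive extraction. Given the 5000 ms default in the error message, it plausibly refers to vLLM's frontend RPC timeout; the sketch below is written under that assumption, and the exact variable name differs across vLLM versions, so treat it as illustrative only.

```bash
# Assumption: the truncated advice refers to the frontend RPC timeout
# (the source of "Server didn't reply within 5000 ms"). The exact
# variable name varies by vLLM version; the value is in milliseconds.
export VLLM_RPC_TIMEOUT=30000   # raise well above the 5000 ms default
```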
I'm hitting the same error: …
Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
I am attempting to profile the performance of vLLM v0.6.0 by following the vLLM profiling documentation.
Here’s the process I followed:
export VLLM_TORCH_PROFILER_DIR=/app/vllm_profile
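For context, the documented workflow wraps that export around the server launch and the serving benchmark. Below is a minimal sketch assuming the standard OpenAI server entrypoint and the benchmark script's `--profile` flag described in the profiling docs; the model name, dataset values, and prompt count are placeholders, not taken from this report.

```bash
# Sketch of the documented profiling flow using VLLM_TORCH_PROFILER_DIR.
# Model, dataset, and prompt-count values are placeholders.
export VLLM_TORCH_PROFILER_DIR=/app/vllm_profile

# Start the OpenAI-compatible server with the profiler directory set.
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct &

# In a second shell: --profile asks the server to start and stop the
# torch profiler around the benchmark run, writing traces to the
# directory above.
python benchmarks/benchmark_serving.py \
    --backend vllm \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --dataset-name sharegpt \
    --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json \
    --num-prompts 100 \
    --profile
```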
While running the benchmark, I encountered a `TimeoutError: Server didn't reply within 5000 ms` in the vLLM OpenAI server log. The relevant section of the log is as follows: