Your current environment

I'm using vLLM-Docker latest (0.6.2).

Model Input Dumps

No response

🐛 Describe the bug
INFO 10-10 00:56:44 api_server.py:164] Multiprocessing frontend to use ipc:///tmp/6f288ab9-add1-4cfb-a217-af1687e882b5 for IPC Path.
qwen72-1 | INFO 10-10 00:56:44 api_server.py:177] Started engine process with PID 36
qwen72-1 | Unrecognized keys in rope_scaling for 'rope_type'='default': {'mrope_section'}
qwen72-1 | Traceback (most recent call last):
qwen72-1 | File "", line 198, in _run_module_as_main
qwen72-1 | File "", line 88, in _run_code
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 571, in
qwen72-1 | uvloop.run(run_server(args))
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/uvloop/init.py", line 109, in run
qwen72-1 | return __asyncio.run(
qwen72-1 | ^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
qwen72-1 | return runner.run(main)
qwen72-1 | ^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
qwen72-1 | return self._loop.run_until_complete(task)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/uvloop/init.py", line 61, in wrapper
qwen72-1 | return await main
qwen72-1 | ^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 538, in run_server
qwen72-1 | async with build_async_engine_client(args) as engine_client:
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/contextlib.py", line 210, in aenter
qwen72-1 | return await anext(self.gen)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 105, in build_async_engine_client
qwen72-1 | async with build_async_engine_client_from_engine_args(
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/contextlib.py", line 210, in aenter
qwen72-1 | return await anext(self.gen)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 182, in build_async_engine_client_from_engine_args
qwen72-1 | engine_config = engine_args.create_engine_config()
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 874, in create_engine_config
qwen72-1 | model_config = self.create_model_config()
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 811, in create_model_config
qwen72-1 | return ModelConfig(
qwen72-1 | ^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 207, in init
qwen72-1 | self.max_model_len = _get_and_verify_max_len(
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 1746, in _get_and_verify_max_len
qwen72-1 | assert "factor" in rope_scaling
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | AssertionError
Transformers v4.45 has a bug where Qwen2-VL config cannot be loaded correctly. Please either downgrade to vLLM 0.6.1 to use Transformers v4.44, or install vLLM from source to use a patched version of Qwen2-VL config.
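If you want to confirm you're hitting this bug before switching versions, here is a minimal sketch (the model name is an assumption; substitute whatever checkpoint you pass to vLLM) that prints how the installed Transformers parsed the rope_scaling section:

# Minimal sketch: inspect how the installed Transformers parses the Qwen2-VL
# rope_scaling config. The model name is an assumption; use your own checkpoint.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2-VL-72B-Instruct")
print(config.rope_scaling)
# The checkpoint's config.json ships {"type": "mrope", "mrope_section": [...]}.
# On the buggy Transformers v4.45 this comes back as rope_type "default" with
# no "factor" key, which is exactly what trips vLLM's
# `assert "factor" in rope_scaling` in the traceback above.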
I downgraded to v0.6.1.post2, then I tried v0.6.1, and this brought up another error:
ValueError: The checkpoint you are trying to load has model type qwen2_vl but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
So I'll wait until it's fixed in the next version.
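That second error just means the Transformers build inside that environment predates Qwen2-VL support. A quick sanity check, sketched against standard Transformers internals:

# Sanity check: does the installed Transformers know the qwen2_vl model type?
import transformers
from transformers.models.auto.configuration_auto import CONFIG_MAPPING

print(transformers.__version__)
print("qwen2_vl" in CONFIG_MAPPING)  # False on versions without Qwen2-VL support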
Try this Docker image, soloking/vllm-openai:v0.6.1, or build an image with the Dockerfile below; it works for me.
FROM docker.io/vllm/vllm-openai:v0.6.1
# Install Transformers from the commit that is said to patch Qwen2-VL config loading.
RUN python3 -m pip install -U git+https://github.com/huggingface/transformers.git@21fac7abba2a37fae86106f87fcf9974fd1e3830
# Upgrade flash-attn, building against the already-installed torch (--no-build-isolation).
RUN pip install -U flash-attn --no-build-isolation
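If you build it yourself, usage looks roughly like this (the image tag, model name, and flags are illustrative, not prescriptive):

docker build -t vllm-openai-qwen2vl .
docker run --gpus all -p 8000:8000 vllm-openai-qwen2vl --model Qwen/Qwen2-VL-72B-Instruct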