
[Bug]: loading qwen2-vl-7b fails with error: assert "factor" in rope_scaling #8388

Closed
1 task done
abacaj opened this issue Sep 12, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@abacaj

abacaj commented Sep 12, 2024

Your current environment

The output of `python collect_env.py`
Versions of relevant libraries:
[pip3] flake8==6.0.0
[pip3] lion-pytorch==0.1.2
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.23.5
[pip3] nvidia-cublas-cu11==11.10.3.66
[pip3] nvidia-cublas-cu12==12.1.3.1
[pip3] nvidia-cuda-cupti-cu11==11.7.101
[pip3] nvidia-cuda-cupti-cu12==12.1.105
[pip3] nvidia-cuda-nvrtc-cu11==11.7.99
[pip3] nvidia-cuda-nvrtc-cu12==12.1.105
[pip3] nvidia-cuda-runtime-cu11==11.7.99
[pip3] nvidia-cuda-runtime-cu12==12.1.105
[pip3] nvidia-cudnn-cu11==8.5.0.96
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu11==10.9.0.58
[pip3] nvidia-cufft-cu12==11.0.2.54
[pip3] nvidia-curand-cu11==10.2.10.91
[pip3] nvidia-curand-cu12==10.3.2.106
[pip3] nvidia-cusolver-cu11==11.4.0.1
[pip3] nvidia-cusolver-cu12==11.4.5.107
[pip3] nvidia-cusparse-cu11==11.7.4.91
[pip3] nvidia-cusparse-cu12==12.1.0.106
[pip3] nvidia-ml-py==12.555.43
[pip3] nvidia-nccl-cu11==2.14.3
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] nvidia-nvjitlink-cu12==12.3.52
[pip3] nvidia-nvtx-cu11==11.7.91
[pip3] nvidia-nvtx-cu12==12.1.105
[pip3] pynvml==11.5.0
[pip3] pyzmq==25.1.0
[pip3] sentence-transformers==2.2.2
[pip3] torch==2.4.0
[pip3] torchvision==0.19.0
[pip3] transformers==4.45.0.dev0
[pip3] transformers-stream-generator==0.0.4
[pip3] triton==3.0.0
[pip3] vllm-nccl-cu12==2.18.1.0.3.0
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.6.1@3fd2b0d21cd9ec78de410fdf8aa1de840e9ad77a
vLLM Build Flags

🐛 Describe the bug

Traceback (most recent call last):
  File "/home/anton/personal/transformer-experiments/inference/vllm_multi.py", line 21, in <module>
    run_server(args)
  File "/home/anton/personal/transformer-experiments/inference/vllm_multi.py", line 9, in run_server
    llm = load_model(args.model, 8192, args.gpu)
  File "/home/anton/personal/transformer-experiments/inference/model.py", line 19, in load_model
    engine = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(
  File "/home/anton/personal/transformer-experiments/env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 726, in from_engine_args
    engine_config = engine_args.create_engine_config()
  File "/home/anton/personal/transformer-experiments/env/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 844, in create_engine_config
    model_config = self.create_model_config()
  File "/home/anton/personal/transformer-experiments/env/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 782, in create_model_config
    return ModelConfig(
  File "/home/anton/personal/transformer-experiments/env/lib/python3.10/site-packages/vllm/config.py", line 227, in __init__
    self.max_model_len = _get_and_verify_max_len(
  File "/home/anton/personal/transformer-experiments/env/lib/python3.10/site-packages/vllm/config.py", line 1739, in _get_and_verify_max_len
    assert "factor" in rope_scaling

The recent qwen2-vl merge added a check on rope_type (`if rope_type == "mrope"`): 3b7fea7#diff-7eaad0b7dee0626bf29d10081b0f0c5e3ea15a4af97e7b182a4e0d35f8346953R1736

But Hugging Face transformers is overriding this key to "default" for some reason:

            if self.rope_scaling["type"] == "mrope":
                self.rope_scaling["type"] = "default"

https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_vl/configuration_qwen2_vl.py#L240
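A small diagnostic sketch (the model ID is an assumption; a local checkpoint path behaves the same) to see what transformers actually hands to vLLM after config post-processing:

    # Inspect the rope_scaling dict after Qwen2VLConfig post-processing.
    # On an affected transformers build the "type" comes back as "default"
    # instead of "mrope", which is what trips vLLM's `assert "factor" in rope_scaling`.
    from transformers import AutoConfig

    cfg = AutoConfig.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
    print(cfg.rope_scaling)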

Do you know what the correct way to load the model is?

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
abacaj added the bug (Something isn't working) label Sep 12, 2024
@SHRISH01

The specific issue is :

rope_scaling["type"] - key is being overridden to "default" even if it is initially set to "mrope".

Try :

    if self.rope_scaling["type"] != "mrope":
        self.rope_scaling["type"] = "default"

This way, the original value of "mrope" will be preserved, allowing the model to open correctly.

@abacaj
Author

abacaj commented Sep 12, 2024

The specific issue is :

rope_scaling["type"] - key is being overridden to "default" even if it is initially set to "mrope".

Try :

if self.rope_scaling["type"] != "mrope": self.rope_scaling["type"] = "default"

This way, the original value of "mrope" will be preserved, allowing the model to open correctly.

Uh is this an AI reply? Because the solution doesn't make sense...

@DarkLight1337
Member

DarkLight1337 commented Sep 12, 2024

Which version of transformers are you using? It is a known bug in transformers so you need to use the specific version (not just any dev version) as mentioned in our docs.
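A quick way to confirm which transformers build is installed (sketch; the exact commit to pin is the one named in the vLLM docs and is not reproduced here):

    # Check the installed transformers version; a 4.45.0.dev0 string alone does not
    # identify the commit, so reinstall from the commit pinned in the vLLM docs if needed.
    import transformers

    print(transformers.__version__)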

@abacaj
Author

abacaj commented Sep 12, 2024

Which version of transformers are you using? It is a known bug in transformers so you need to use the specific version (not just any dev version) as mentioned in our docs.

Got it, yeah, now I see it was a recent change to transformers (I was using the main branch). Thanks!
