
How to solve warning: The context length of the model is too short to hold the multi-modal embeddings in the worst case #738

Open
vefalun opened this issue Feb 8, 2025 · 2 comments

Comments

vefalun commented Feb 8, 2025

When I run vLLM based on the code example in the README on an 8-card A100, the following warning occurs:

(VllmWorkerProcess pid=427033) WARNING 02-08 11:44:42 profiling.py:187] The context length (128000) of the model is too short to hold the multi-modal embeddings in the worst case (131072 tokens in total, out of which {'image': 16384, 'video': 114688} are reserved for multi-modal embeddings). This may cause certain multi-modal inputs to fail during inference, even when the input text is short. To avoid this, you should increase max_model_len, reduce max_num_seqs, and/or reduce mm_counts.

However, I couldn't find the configurations for `max_model_len`, `max_num_seqs`, and `mm_counts` in the `config.json` file. How should I adjust these settings to avoid this warning? Thank you very much!
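For what it's worth, these three settings are vLLM engine arguments rather than fields in the model's `config.json`, so they are passed when the engine is constructed. Below is a minimal sketch; the model name, the concrete values, and the tensor-parallel size are illustrative assumptions, not recommendations from the Qwen team:

```python
from vllm import LLM

# Sketch: these engine arguments control the profiling budget; none of
# them live in the checkpoint's config.json. Values are illustrative.
llm = LLM(
    model="Qwen/Qwen2.5-VL-7B-Instruct",  # assumed checkpoint from the README
    max_model_len=32768,                   # cap on tokens per sequence
    max_num_seqs=5,                        # fewer concurrent sequences
    limit_mm_per_prompt={"image": 4, "video": 1},  # per-request mm_counts
    tensor_parallel_size=8,                # matches the 8-card A100 setup
)
```

The same knobs are available on the CLI, e.g. `vllm serve <model> --max-model-len 32768 --max-num-seqs 5 --limit-mm-per-prompt image=4,video=1`.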

luosting commented Feb 8, 2025

I met the same problem; the number of tokens generated by the new version is larger than in the previous one.

wulipc commented Feb 12, 2025

Hi, thanks for your interest in the Qwen model! This warning appears during the vLLM profile_run. In the original code, we added +1 to the video's num_frames in the dummy_data step to avoid having an odd number of frames. This resulted in generating more tokens than allowed (the context length). This issue has been fixed in the latest vLLM code; check out the details in this PR. This warning won't affect your actual inference, so no worries there. If it bothers you, you can update to the fixed version.
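For readers wondering how a single padded frame can overflow the budget, here is a toy illustration of the failure mode described above; the per-frame token cost and frame counts are hypothetical numbers, not values from the vLLM source:

```python
# Hypothetical arithmetic: profiling packs as many dummy video frames as
# fit the context, then the old +1 padding forces an even frame count
# and pushes the worst case past the limit.
context_len = 128_000
tokens_per_frame = 1_792                      # assumed per-frame cost

num_frames = context_len // tokens_per_frame  # 71 frames fit
if num_frames % 2 != 0:                       # old code: avoid odd counts
    num_frames += 1                           # 71 -> 72

worst_case = num_frames * tokens_per_frame    # 129,024 tokens
print(worst_case > context_len)               # True -> the warning fires
```

Since this only affects the synthetic profiling input, real requests that fit within the context length are unaffected, which is why the warning is harmless.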
