Support qwenvl model for HPU #793

Open: wants to merge 1 commit into base: habana_main

@yingjie-han commented Feb 7, 2025

This PR adds support for Qwen-VL vision inference on HPU.

Issue to solve

The function merge_multimodal_embeddings() in utils.py causes a dynamic-shape problem on HPU.

Solution

Flatten the embeddings tensor and merge the multimodal embeddings with index_put_() directly in qwen.py, instead of calling merge_multimodal_embeddings() from utils.py.
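
For illustration, here is a minimal sketch of the flatten-then-index_put_() pattern described above. It is not the code from this PR: the helper name merge_image_embeddings, its argument names, and the way placeholder positions are located (via torch.nonzero on the token ids) are all assumptions made for the example, and it is only meant to show the indexing pattern on ordinary tensors.

```python
import torch


def merge_image_embeddings(input_ids, inputs_embeds, image_embeds, image_token_id):
    """Hypothetical helper: overwrite placeholder-token rows with image embeddings.

    input_ids:     (batch, seq_len) token ids
    inputs_embeds: (batch, seq_len, hidden) text embeddings
    image_embeds:  (..., hidden) image embeddings, one row per placeholder token
    """
    batch, seq_len, hidden = inputs_embeds.shape

    # Flatten to 2-D so a single 1-D index tensor addresses every token row.
    flat_embeds = inputs_embeds.reshape(-1, hidden)
    flat_ids = input_ids.reshape(-1)

    # Assumption: placeholder positions are found by comparing against one token id.
    positions = torch.nonzero(flat_ids == image_token_id, as_tuple=True)[0]

    # Write the image rows into the flattened embeddings in place.
    values = image_embeds.reshape(-1, hidden).to(flat_embeds.dtype)
    flat_embeds.index_put_((positions,), values)

    return flat_embeds.reshape(batch, seq_len, hidden)


# Toy usage: token id 9 marks an image placeholder.
ids = torch.tensor([[1, 9, 9, 2], [9, 3, 4, 5]])
text = torch.zeros(2, 4, 3)
image = torch.arange(1, 10, dtype=torch.float32).reshape(3, 3)
merged = merge_image_embeddings(ids, text, image, image_token_id=9)
```

In this sketch the number of image rows must equal the number of placeholder tokens; otherwise index_put_() raises a shape error.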

Test

Single image
python examples/offline_inference/vision_language.py -m qwen_vl

Multiple images
python examples/offline_inference/vision_language_multi_image.py -m qwen_vl_chat
