Feature request
llama.cpp recently added support for Qwen2VL, which means we can now quantize Qwen2VL models (and I've done so successfully!). I'd like to be able to load quantized Qwen2VL models with AutoModelForVision2Seq; currently, transformers doesn't recognize qwen2vl as a valid architecture.
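Concretely, the goal would be for something like the sketch below to work, mirroring how other GGUF-enabled architectures load today (the repo id and filename are placeholders, not a real checkpoint); at the moment this fails because qwen2vl isn't recognized:

```python
from transformers import AutoModelForVision2Seq

# Placeholder repo id and GGUF filename -- substitute any GGUF-quantized Qwen2VL checkpoint.
model = AutoModelForVision2Seq.from_pretrained(
    "my-org/Qwen2-VL-7B-Instruct-GGUF",
    gguf_file="qwen2-vl-7b-instruct-Q4_K_M.gguf",
)
```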
Motivation
It would be wonderful to be able to use quantized GGUF Qwen2VL models!
Your contribution
I'm happy to work up the PR for this if I can get some direction on where to start. I'm hacking through the code right now, but I don't know it well enough yet to make a meaningful dent in the problem.
Hi @cheald! You could check previous GGUF implementations for other models by following this task: #33260. It has a well-described workflow and a lot of different PRs to learn from. If you have any questions, feel free to ask.
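For orientation, the general shape of those PRs is to teach transformers' GGUF integration about the new architecture: map the GGUF metadata keys and tensor names for qwen2vl onto the corresponding transformers config attributes and module names, then add a conversion test. A very loose sketch of what such a mapping entry looks like (the key names below are illustrative assumptions, not the actual transformers symbols):

```python
# Illustrative only: GGUF metadata keys for an architecture follow the
# "<arch>.<field>" pattern; the right-hand side is the transformers config attribute.
QWEN2VL_GGUF_CONFIG_MAP = {
    "qwen2vl.context_length": "max_position_embeddings",
    "qwen2vl.block_count": "num_hidden_layers",
    "qwen2vl.attention.head_count": "num_attention_heads",
}
```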