🚀 The feature, motivation and pitch

Why was exl2 support dropped?
Is there anything blocking this that the community can help with?

Alternatives

No response

Additional context

No response

The primary culprit was the upstream PR vllm-project/vllm#3977, which drastically changed how quantized layers are handled and made keeping the exllamav2 integration working extremely difficult. If someone can get the existing exl2 quantization working with the changes from that PR, it should be much easier to maintain going forward.
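For anyone who wants to pick this up: the #3977 refactor routes each quantization scheme through a `QuantizationConfig` / `LinearMethodBase` pair, where `create_weights` registers the quantized parameter buffers on a layer and `apply` dispatches the actual kernel. Below is a minimal sketch of what an exl2 adapter might look like under that interface. The class names and hook signatures follow my reading of that PR era and may have drifted since; the `Exl2Config` / `Exl2LinearMethod` names and the capability/dtype choices are assumptions, not anything from the vLLM codebase.

```python
# Hypothetical sketch of an exl2 adapter for the post-#3977 quantization
# interface. Illustrative only: signatures are approximations of vLLM's
# QuantizationConfig / LinearMethodBase from that era and may have changed.
from typing import Any, Dict, List, Optional

import torch

from vllm.model_executor.layers.linear import LinearBase, LinearMethodBase
from vllm.model_executor.layers.quantization.base_config import QuantizationConfig


class Exl2Config(QuantizationConfig):
    """Config shim mapping an exl2 checkpoint onto vLLM's quant interface."""

    @classmethod
    def get_name(cls) -> str:
        return "exl2"

    @classmethod
    def get_supported_act_dtypes(cls) -> List[torch.dtype]:
        return [torch.float16]

    @classmethod
    def get_min_capability(cls) -> int:
        # Assumption: exllamav2 kernels target roughly Volta and newer.
        return 70

    @classmethod
    def get_config_filenames(cls) -> List[str]:
        # exl2 stores quantization parameters per-tensor in the checkpoint,
        # not in a separate quantize_config.json.
        return []

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "Exl2Config":
        return cls()

    def get_quant_method(self, layer: torch.nn.Module,
                         prefix: str) -> Optional["Exl2LinearMethod"]:
        if isinstance(layer, LinearBase):
            return Exl2LinearMethod(self)
        return None


class Exl2LinearMethod(LinearMethodBase):
    """Creates exl2 weight buffers and dispatches to the exllamav2 kernel."""

    def __init__(self, quant_config: Exl2Config):
        self.quant_config = quant_config

    def create_weights(self, layer, input_size_per_partition,
                       output_partition_sizes, input_size, output_size,
                       params_dtype, **extra_weight_attrs):
        # Register the packed q_weight / scales / group-index tensors on
        # `layer` here so the post-#3977 weight loader can fill them from
        # the exl2 checkpoint.
        ...

    def apply(self, layer, x: torch.Tensor,
              bias: Optional[torch.Tensor] = None) -> torch.Tensor:
        # Placeholder: this is where the exllamav2 GEMM extension would be
        # invoked on the packed weights registered in create_weights.
        raise NotImplementedError("wire up the exllamav2 kernel here")
```

The real work is in the two stubbed methods: `create_weights` has to map exl2's packed tensor layout onto vLLM's weight-loading hooks (including tensor-parallel partitioning), and `apply` has to bind the exllamav2 GEMM extension. Everything else is boilerplate around the refactored interface.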