
autoawq for vllm #13343

Closed
drewskidang opened this issue Nov 14, 2023 · 3 comments
Labels
🤖:improvement Medium size change to existing code to handle new use-cases Ɑ: models Related to LLMs or chat model modules

Comments

@drewskidang

Feature request

Is there a way to get AutoAWQ support for VLLM? I'm setting quantization to 'awq', but it's not working.

Motivation

Faster inference.

Your contribution

N/A


dosubot bot commented Nov 14, 2023

🤖

Hello @drewskidang,

I'm Dosu, a bot here to assist you with your LangChain concerns while we wait for a human maintainer. Whether it's troubleshooting bugs, answering questions, or guiding new contributors, I'm here to help. Now, let's take a closer look at your issue regarding 'autoawq' support for 'vllm'. I'll get back to you with a comprehensive answer shortly.

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:improvement Medium size change to existing code to handle new use-cases labels Nov 14, 2023

dosubot bot commented Nov 14, 2023

🤖

I understand that you're trying to set the quantization to 'awq' for faster inference, but it's not working. Based on the information available in the LangChain repository, there was a similar issue related to VLLM which was resolved (Issue #8729). However, there's no specific information about the 'awq' quantization in the LangChain codebase.

The VLLM class in LangChain (implemented in vllm.py) wraps the vLLM engine for language model inference. However, without the exact error message you're seeing when setting quantization to 'awq', it's difficult to suggest a precise fix.

Could you please provide more details about the error you're encountering? This could include the exact error message, the code snippet where the error occurs, and the steps you've taken to try to resolve the issue. This information will help in diagnosing the problem and providing a more accurate solution.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@mspronesti
Contributor

mspronesti commented Nov 26, 2023

You need to pass it as a kwarg. Check this comment of mine on the official repo: vllm-project/vllm#1162 (comment)
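For anyone landing here later, a minimal sketch of what "pass it as a kwarg" looks like with the LangChain VLLM wrapper, assuming its vllm_kwargs field forwards extra arguments to vllm.LLM (the model name below is illustrative; see the linked vLLM comment for details):

```python
# Minimal sketch (not tested here): pass quantization through vllm_kwargs,
# which the LangChain VLLM wrapper forwards to vllm.LLM(...).
# On older LangChain versions the import is `from langchain.llms import VLLM`.
from langchain_community.llms import VLLM

llm = VLLM(
    model="TheBloke/Llama-2-7B-AWQ",  # illustrative AWQ-quantized checkpoint
    trust_remote_code=True,           # required for some Hugging Face models
    max_new_tokens=256,
    # Anything not covered by the wrapper's own fields goes here and is
    # passed straight through to vllm.LLM, e.g. quantization="awq".
    vllm_kwargs={"quantization": "awq"},
)

print(llm.invoke("Explain AWQ quantization in one sentence."))
```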

@dosubot dosubot bot added the stale label (issue has not had recent activity or appears to be solved; stale issues will be automatically closed) on Feb 25, 2024
@dosubot dosubot bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Mar 13, 2024
@dosubot dosubot bot removed the stale label on Mar 13, 2024