autoawq for vllm #13343
Comments
🤖 Hello @drewskidang, I'm Dosu, a bot here to assist you with your LangChain concerns while we wait for a human maintainer. Whether it's troubleshooting bugs, answering questions, or guiding new contributors, I'm here to help. Now, let's take a closer look at your issue regarding 'autoawq' support for 'vllm'. I'll get back to you with a comprehensive answer shortly.
🤖 I understand that you're trying to set the quantization to 'awq' for faster inference, but it's not working. Based on the information available in the LangChain repository, there was a similar issue related to VLLM which was resolved (Issue #8729). However, there's no specific information about 'awq' quantization in the LangChain codebase. The 'vllm' class in LangChain is a powerful tool for handling language model operations, as detailed in the vllm.py file. Without more information about the exact error message you're encountering when trying to set quantization to 'awq', it's difficult to provide a precise solution.

Could you please provide more details about the error you're encountering? This could include the exact error message, the code snippet where the error occurs, and the steps you've taken to try to resolve the issue. This information will help in diagnosing the problem and providing a more accurate solution.

Sources

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
You need to pass it as a kwarg. Check this comment of mine on the official repo: vllm-project/vllm#1162 (comment)
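For illustration, a minimal sketch of what "pass it as a kwarg" might look like through LangChain's VLLM wrapper. This assumes your LangChain version exposes a `vllm_kwargs` field that is forwarded to the underlying `vllm.LLM` engine, and the model ID below is a placeholder for whatever AWQ-quantized checkpoint you actually use; see the linked vllm comment for the authoritative details.

```python
# Sketch: forwarding the AWQ quantization flag to the vllm engine via
# LangChain's VLLM wrapper. The model ID is a placeholder for any
# AWQ-quantized checkpoint available to you.
from langchain.llms import VLLM

llm = VLLM(
    model="TheBloke/Mistral-7B-Instruct-v0.1-AWQ",  # placeholder AWQ model
    trust_remote_code=True,                         # needed by some repos
    max_new_tokens=128,
    vllm_kwargs={"quantization": "awq"},            # passed through to vllm.LLM(...)
)

# Classic LLM call style from the LangChain 0.0.x era this issue dates from.
print(llm("What does AWQ quantization do?"))
```

If `vllm_kwargs` is not available in your installed version, upgrading LangChain or passing the option wherever your version forwards extra engine arguments should achieve the same effect.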
Feature request
Is there a way to get autoawq support for vllm? I'm setting quantization to 'awq' but it's not working.
Motivation
faster inference
Your contribution
N/A