
autoawq for vllm #13343

Closed
drewskidang opened this issue Nov 14, 2023 · 3 comments
Labels
🤖:improvement Medium size change to existing code to handle new use-cases Ɑ: models Related to LLMs or chat model modules

Comments

@drewskidang

Feature request

Is there a way to get AutoAWQ support for VLLM? I'm setting quantization to 'awq', but it's not working.

Motivation

Faster inference.

Your contribution

N/A


dosubot bot commented Nov 14, 2023

🤖

Hello @drewskidang,

I'm Dosu, a bot here to assist you with your LangChain concerns while we wait for a human maintainer. Whether it's troubleshooting bugs, answering questions, or guiding new contributors, I'm here to help. Now, let's take a closer look at your issue regarding 'autoawq' support for 'vllm'. I'll get back to you with a comprehensive answer shortly.

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:improvement Medium size change to existing code to handle new use-cases labels Nov 14, 2023

dosubot bot commented Nov 14, 2023

🤖

I understand that you're trying to set the quantization to 'awq' for faster inference, but it's not working. Based on the information available in the LangChain repository, there was a similar issue related to VLLM which was resolved (Issue #8729). However, there's no specific information about the 'awq' quantization in the LangChain codebase.

The VLLM class in LangChain (implemented in vllm.py) wraps the vLLM engine for language model inference. However, without the exact error message you're seeing when setting quantization to 'awq', it's difficult to suggest a precise fix.

Could you please provide more details about the error you're encountering? This could include the exact error message, the code snippet where the error occurs, and the steps you've taken to try to resolve the issue. This information will help in diagnosing the problem and providing a more accurate solution.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@mspronesti
Contributor

mspronesti commented Nov 26, 2023

You need to pass it as a kwarg. Check this comment of mine on the official repo: vllm-project/vllm#1162 (comment)
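For anyone landing here later, a minimal sketch of what "pass it as a kwarg" looks like with the LangChain VLLM wrapper, assuming its vllm_kwargs field forwards extra arguments to vllm.LLM (the model name below is illustrative; see the linked vLLM comment for details):

```python
# Minimal sketch (not tested here): pass quantization through vllm_kwargs,
# which the LangChain VLLM wrapper forwards to vllm.LLM(...).
# On older LangChain versions the import is `from langchain.llms import VLLM`.
from langchain_community.llms import VLLM

llm = VLLM(
    model="TheBloke/Llama-2-7B-AWQ",  # illustrative AWQ-quantized checkpoint
    trust_remote_code=True,           # required for some Hugging Face models
    max_new_tokens=256,
    # Anything not covered by the wrapper's own fields goes here and is
    # passed straight through to vllm.LLM, e.g. quantization="awq".
    vllm_kwargs={"quantization": "awq"},
)

print(llm.invoke("Explain AWQ quantization in one sentence."))
```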

@dosubot dosubot bot added the stale label (issue has not had recent activity or appears to be solved; stale issues will be automatically closed) on Feb 25, 2024
@dosubot dosubot bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Mar 13, 2024
@dosubot dosubot bot removed the stale label on Mar 13, 2024