FEAT: Supports LoRA for LLM and image models #1080
Conversation
Hello, I'm trying to use xinference 0.9.1 to load a LoRA adapter; below is my running code:

```python
from xinference.client import Client

client = Client("http://120.48.137.80:9997")
```

No `peft_model_path` parameter is shown when the model is launched, and when I looked at the source code it really doesn't have this parameter.
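For context, once this PR lands, launching an LLM with a LoRA adapter should look roughly like the minimal sketch below. It assumes `peft_model_path` is accepted as a keyword argument to `launch_model`, as the PR title and discussion suggest; the endpoint address and adapter path are placeholders.

```python
from xinference.client import Client

# Connect to a running xinference endpoint (address is a placeholder).
client = Client("http://127.0.0.1:9997")

# Launch qwen-chat 7B with a LoRA adapter; the adapter path is hypothetical.
model_uid = client.launch_model(
    model_name="qwen-chat",
    model_size_in_billions=7,
    model_format="pytorch",
    peft_model_path="/path/to/lora-adapter",
)
print(model_uid)
```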
This PR will be merged for the release of
Xinference loads the Qwen LoRA model without vLLM acceleration.
Currently xinference does not support LoRA for vLLM. |
Related #389
Fixes #271
Fixes #840
Fixes #1041
Example
Image model
sdxl-turbo
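Below is a minimal sketch of the image-model case, assuming `launch_model` also accepts `peft_model_path` for image models and that the launched model exposes xinference's `text_to_image` method; the endpoint, adapter path, and prompt are placeholders.

```python
from xinference.client import Client

client = Client("http://127.0.0.1:9997")  # placeholder endpoint

# Launch sdxl-turbo with a LoRA adapter (the adapter path is hypothetical).
model_uid = client.launch_model(
    model_name="sdxl-turbo",
    model_type="image",
    peft_model_path="/path/to/sdxl-lora",
)

# Generate an image; text_to_image returns an OpenAI-style image response.
model = client.get_model(model_uid)
result = model.text_to_image("a watercolor painting of a lighthouse", n=1)
print(result["data"][0]["url"])
```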
LLM model
qwen-chat
7B: (answer screenshots)
Without LoRA, qwen-chat 7B's Korean language ability is very weak. With the Korean-enhanced LoRA, its Korean answers are very fluent.
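A sketch of reproducing this comparison, reusing the launch pattern from the earlier sketch with a Korean-enhancement adapter; the endpoint, adapter path, and prompt are placeholders, and the OpenAI-style response shape is assumed from xinference's chat API.

```python
from xinference.client import Client

client = Client("http://127.0.0.1:9997")  # placeholder endpoint

# Launch qwen-chat 7B with a Korean-enhancement LoRA adapter
# (the adapter path is hypothetical).
model_uid = client.launch_model(
    model_name="qwen-chat",
    model_size_in_billions=7,
    model_format="pytorch",
    peft_model_path="/path/to/korean-lora",
)

# Ask a question in Korean and print the reply.
model = client.get_model(model_uid)
response = model.chat("한국어로 간단히 자기소개를 해 주세요.")
print(response["choices"][0]["message"]["content"])
```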