Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem statment: There are alot of powerfull LLMs backend with terrible UI and alot of nice front end LLMw with bad/unscalable backend.
This PR will allow to use custom models from a private server hosting openai compatible API
Requirments:
a backend with openAI compatble API
here are some popular ones
FastChat, FastChat is an open platform for training, serving, and evaluating large language model based chatbots. Includes scalability capabilities
vLLM faster/scalable version of hosting LLMs
LMDeploy Less popular, but more faster/scalable version of vLLM
llama-cpp-python, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
text-generation-webui, the most popular web UI. Supports NVidia CUDA GPU acceleration.
LM Studio, a fully featured local GUI with GPU acceleration on both Windows (NVidia and AMD), and macOS.
ctransformers, a Python library with GPU accel, LangChain support, and OpenAI-compatible AI server.
this will make libraries like [llama-gpt]https://github.com/getumbrel/llama-gpt
obsolete as the focus of this project is nice UI, but has a unscallable backend
Here is how to test it
I recommend vLLM due to its powerfull backend
To skip the 2 steps above test using my api server... Ill host it for a week or untill this PR is closed
https://major-collie-officially.ngrok-free.app/v1/chat/completions
API keyEMPTY
Enter in your credentials into the app
Choose a model supported by your API
Start chating on a local private server!!!