feat(llm): Add openailike llm mode #1447

Merged · 2 commits · Dec 26, 2023
21 changes: 20 additions & 1 deletion fern/docs/pages/manual/llms.mdx
@@ -37,6 +37,7 @@ llm:
  mode: openai

openai:
  api_base: <openai-api-base-url> # Defaults to https://api.openai.com/v1
  api_key: <your_openai_api_key> # You can skip this setting and use the OPENAI_API_KEY env var instead
  model: <openai_model_to_use> # Optional model to use. Default is "gpt-3.5-turbo"
  # Note: OpenAI models are listed here: https://platform.openai.com/docs/models
@@ -55,6 +56,24 @@ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:80
You'll notice the speed and quality of response is higher, given you are using OpenAI's servers for the heavy
computations.

### Using an OpenAI-compatible API

Many tools, including [LocalAI](https://localai.io/) and [vLLM](https://docs.vllm.ai/en/latest/),
support serving local models with an OpenAI-compatible API. The `openai` mode does not let you use
custom models, even when `api_base` is overridden; use the `openailike` mode instead:

```yaml
llm:
  mode: openailike
```

This mode uses the same settings as the `openai` mode.

As an example, you can follow the [vLLM quickstart guide](https://docs.vllm.ai/en/latest/getting_started/quickstart.html#openai-compatible-server)
to run an OpenAI compatible server. Then, you can run PrivateGPT using the `settings-vllm.yaml` profile:

`PGPT_PROFILES=vllm make run`
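As a quick sanity check once such a server is up, you can send a chat-completions request to the OpenAI-compatible endpoint directly. The sketch below only builds the request (it does not send it) and assumes a vLLM server at `http://localhost:8000/v1` serving `facebook/opt-125m`, matching the `settings-vllm.yaml` profile in this PR:

```python
import json
import urllib.request

# OpenAI-compatible endpoint exposed by the local vLLM server (assumed URL).
api_base = "http://localhost:8000/v1"

# Standard chat-completions payload; the model name must match what the
# server is actually serving.
payload = {
    "model": "facebook/opt-125m",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 32,
}

request = urllib.request.Request(
    f"{api_base}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # vLLM accepts any bearer token by default, hence "EMPTY".
        "Authorization": "Bearer EMPTY",
    },
    method="POST",
)

# Against a live server: urllib.request.urlopen(request) returns the completion.
```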

### Using AWS Sagemaker

For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using Sagemaker.
@@ -82,4 +101,4 @@ or
`PGPT_PROFILES=sagemaker poetry run python -m private_gpt`

When the server is started it will print a log *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
16 changes: 15 additions & 1 deletion private_gpt/components/llm/llm_component.py
@@ -62,7 +62,21 @@ def __init__(self, settings: Settings) -> None:

                openai_settings = settings.openai
                self.llm = OpenAI(
                    api_key=openai_settings.api_key, model=openai_settings.model
                    api_base=openai_settings.api_base,
                    api_key=openai_settings.api_key,
                    model=openai_settings.model,
                )
            case "openailike":
                from llama_index.llms import OpenAILike

                openai_settings = settings.openai
                self.llm = OpenAILike(
                    api_base=openai_settings.api_base,
                    api_key=openai_settings.api_key,
                    model=openai_settings.model,
                    is_chat_model=True,
                    max_tokens=None,
                    api_version="",
                )
            case "mock":
                self.llm = MockLLM()
6 changes: 5 additions & 1 deletion private_gpt/settings/settings.py
@@ -81,7 +81,7 @@ class DataSettings(BaseModel):


class LLMSettings(BaseModel):
    mode: Literal["local", "openai", "openailike", "sagemaker", "mock"]
    max_new_tokens: int = Field(
        256,
        description="The maximum number of tokens that the LLM is authorized to generate in one completion.",
@@ -156,6 +156,10 @@ class SagemakerSettings(BaseModel):


class OpenAISettings(BaseModel):
    api_base: str = Field(
        None,
        description="Base URL of OpenAI API. Example: 'https://api.openai.com/v1'.",
    )
    api_key: str
    model: str = Field(
        "gpt-3.5-turbo",
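Adding `"openailike"` to the `Literal` means pydantic rejects any other mode at settings-load time. The same membership check can be sketched with stdlib `typing` helpers (a simplified illustration, not the pydantic validation machinery itself):

```python
from typing import Literal, get_args

# Mirrors the Literal from LLMSettings after this PR.
LLMMode = Literal["local", "openai", "openailike", "sagemaker", "mock"]


def validate_mode(mode: str) -> str:
    """Reject modes outside the allowed Literal, as pydantic would."""
    allowed = get_args(LLMMode)
    if mode not in allowed:
        raise ValueError(f"mode must be one of {allowed}, got {mode!r}")
    return mode
```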
14 changes: 14 additions & 0 deletions settings-vllm.yaml
@@ -0,0 +1,14 @@
llm:
  mode: openailike

embedding:
  mode: local
  ingest_mode: simple

local:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5

openai:
  api_base: http://localhost:8000/v1
  api_key: EMPTY
  model: facebook/opt-125m
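Profiles like this one are layered on top of the default settings when named in `PGPT_PROFILES`. A rough sketch of that layering, assuming a simple recursive key-wise merge of profile values over defaults (the actual PrivateGPT loader may differ in detail):

```python
def active_profiles(env: dict[str, str]) -> list[str]:
    """Parse PGPT_PROFILES, a comma-separated list of profile names."""
    raw = env.get("PGPT_PROFILES", "")
    return [p for p in raw.split(",") if p]


def merge_settings(base: dict, override: dict) -> dict:
    """Recursively overlay profile values on top of the defaults."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


# Defaults here are illustrative, not the real settings.yaml contents.
defaults = {"llm": {"mode": "local"}, "openai": {"model": "gpt-3.5-turbo"}}
vllm_profile = {
    "llm": {"mode": "openailike"},
    "openai": {"api_base": "http://localhost:8000/v1", "model": "facebook/opt-125m"},
}

settings = merge_settings(defaults, vllm_profile)
```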