Arcee.ai LLM & Retriever integration (#11579)
- **Description:** This PR introduces a new LLM and Retriever integration with https://arcee.ai for the Python client.
- **Issue:** Implements the integrations requested in #11578.
- **Dependencies:** None.
- **Tag maintainer:** @hwchase17
- **Twitter handle:** shwooobham


**✅ `make format`, `make lint` and `make test` run locally.**
```shell
=========== 1245 passed, 277 skipped, 20 warnings in 16.26s ===========
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run black . --check
All done! ✨ 🍰 ✨
1818 files would be left unchanged.
[ "." = "" ] || poetry run mypy .
Success: no issues found in 1815 source files
[ "." = "" ] || poetry run black .
All done! ✨ 🍰 ✨
1818 files left unchanged.
[ "." = "" ] || poetry run ruff --select I --fix .
poetry run codespell --toml pyproject.toml
poetry run codespell --toml pyproject.toml -w
```


**Contributions**
1. Arcee (langchain/llms), ArceeRetriever (langchain/retrievers), ArceeWrapper (langchain/utilities)
2. Docs for Arcee (llms/arcee.py) and ArceeRetriever (retrievers/arcee.py)
3.

cc: @Jacobsolawetz @Ben-Epstein

---------

Co-authored-by: Shubham <[email protected]>
EricLiclair and Shubham authored Oct 10, 2023
1 parent b6a2507 commit 49de862
Showing 8 changed files with 773 additions and 0 deletions.
146 changes: 146 additions & 0 deletions docs/docs_skeleton/docs/integrations/llms/arcee.ipynb
@@ -0,0 +1,146 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Arcee\n",
"This notebook demonstrates how to use the `Arcee` class for generating text using Arcee's Domain Adapted Language Models (DALMs)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup\n",
"\n",
"Before using Arcee, make sure the Arcee API key is set as the `ARCEE_API_KEY` environment variable. You can also pass the API key as a named parameter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import Arcee\n",
"\n",
"# Create an instance of the Arcee class\n",
"arcee = Arcee(\n",
" model=\"DALM-PubMed\",\n",
" # arcee_api_key=\"ARCEE-API-KEY\" # if not already set in the environment\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional Configuration\n",
"\n",
"You can also configure Arcee's parameters such as `arcee_api_url`, `arcee_app_url`, and `model_kwargs` as needed.\n",
"Setting `model_kwargs` at object initialization uses those parameters as defaults for all subsequent calls that generate a response."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"arcee = Arcee(\n",
" model=\"DALM-Patent\",\n",
" # arcee_api_key=\"ARCEE-API-KEY\", # if not already set in the environment\n",
" arcee_api_url=\"https://custom-api.arcee.ai\", # default is https://api.arcee.ai\n",
" arcee_app_url=\"https://custom-app.arcee.ai\", # default is https://app.arcee.ai\n",
" model_kwargs={\n",
" \"size\": 5,\n",
" \"filters\": [\n",
" {\n",
" \"field_name\": \"document\",\n",
" \"filter_type\": \"fuzzy_search\",\n",
" \"value\": \"Einstein\"\n",
" }\n",
" ]\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generating Text\n",
"\n",
"You can generate text from Arcee by providing a prompt. Here's an example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Generate text\n",
"prompt = \"Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?\"\n",
"response = arcee(prompt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional parameters\n",
"\n",
"Arcee allows you to apply `filters` and set the `size` (the number of retrieved documents) used to aid text generation. Filters help narrow down the results. Here's how to use these parameters:\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define filters\n",
"filters = [\n",
" {\n",
" \"field_name\": \"document\",\n",
" \"filter_type\": \"fuzzy_search\",\n",
" \"value\": \"Einstein\"\n",
" },\n",
" {\n",
" \"field_name\": \"year\",\n",
" \"filter_type\": \"strict_search\",\n",
" \"value\": \"1905\"\n",
" }\n",
"]\n",
"\n",
"# Generate text with filters and size params\n",
"response = arcee(prompt, size=5, filters=filters)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
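Note on the Setup cell above: it says the key can come from the `ARCEE_API_KEY` environment variable or from a named parameter. The lookup order can be sketched as below, assuming the explicit argument takes precedence over the environment. `resolve_api_key` is a hypothetical helper written for illustration; the code that actually resolves the key is in files not shown in this section.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    env_var: str = "ARCEE_API_KEY",
) -> str:
    """Return the explicit key if given, otherwise the value of env_var.

    Hypothetical helper: the precedence shown here is an assumption.
    """
    source = os.environ if env is None else env
    key = explicit or source.get(env_var)
    if not key:
        raise ValueError(f"Set {env_var} or pass arcee_api_key explicitly.")
    return key


# The environment is passed in as a plain mapping so the example runs anywhere.
print(resolve_api_key("from-argument", env={}))
print(resolve_api_key(env={"ARCEE_API_KEY": "from-environment"}))
```

Taking the environment as a parameter keeps the helper easy to test without touching the real process environment.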
141 changes: 141 additions & 0 deletions docs/docs_skeleton/docs/integrations/retrievers/arcee.ipynb
@@ -0,0 +1,141 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Arcee Retriever\n",
"This notebook demonstrates how to use the `ArceeRetriever` class to retrieve relevant document(s) for Arcee's Domain Adapted Language Models (DALMs)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup\n",
"\n",
"Before using `ArceeRetriever`, make sure the Arcee API key is set as the `ARCEE_API_KEY` environment variable. You can also pass the API key as a named parameter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.retrievers import ArceeRetriever\n",
"\n",
"retriever = ArceeRetriever(\n",
" model=\"DALM-PubMed\",\n",
" # arcee_api_key=\"ARCEE-API-KEY\" # if not already set in the environment\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional Configuration\n",
"\n",
"You can also configure `ArceeRetriever`'s parameters such as `arcee_api_url`, `arcee_app_url`, and `model_kwargs` as needed.\n",
"Setting `model_kwargs` at object initialization uses its filters and size as defaults for all subsequent retrievals."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever = ArceeRetriever(\n",
" model=\"DALM-PubMed\",\n",
" # arcee_api_key=\"ARCEE-API-KEY\", # if not already set in the environment\n",
" arcee_api_url=\"https://custom-api.arcee.ai\", # default is https://api.arcee.ai\n",
" arcee_app_url=\"https://custom-app.arcee.ai\", # default is https://app.arcee.ai\n",
" model_kwargs={\n",
" \"size\": 5,\n",
" \"filters\": [\n",
" {\n",
" \"field_name\": \"document\",\n",
" \"filter_type\": \"fuzzy_search\",\n",
" \"value\": \"Einstein\"\n",
" }\n",
" ]\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieving documents\n",
"You can retrieve relevant documents from uploaded contexts by providing a query. Here's an example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?\"\n",
"documents = retriever.get_relevant_documents(query=query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional parameters\n",
"\n",
"Arcee allows you to apply `filters` and set the `size` (the number of retrieved documents). Filters help narrow down the results. Here's how to use these parameters:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define filters\n",
"filters = [\n",
" {\n",
" \"field_name\": \"document\",\n",
" \"filter_type\": \"fuzzy_search\",\n",
" \"value\": \"Music\"\n",
" },\n",
" {\n",
" \"field_name\": \"year\",\n",
" \"filter_type\": \"strict_search\",\n",
" \"value\": \"1905\"\n",
" }\n",
"]\n",
"\n",
"# Retrieve documents with filters and size params\n",
"documents = retriever.get_relevant_documents(query=query, size=5, filters=filters)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
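Note on `model_kwargs` in both notebooks: they describe `size` and `filters` given at initialization as defaults for later calls, and also show the same parameters passed per call. One way to read that is sketched below, assuming per-call arguments override the initialization defaults. `merge_call_params` and `REQUIRED_FILTER_KEYS` are hypothetical names, and the key check only reflects the three keys that appear in the notebook filters; the real merge logic is not part of this section.

```python
from typing import Any, Dict, Optional

# Keys that every filter in the notebooks carries (assumed to be required).
REQUIRED_FILTER_KEYS = {"field_name", "filter_type", "value"}


def merge_call_params(
    defaults: Optional[Dict[str, Any]] = None, **overrides: Any
) -> Dict[str, Any]:
    """Overlay per-call parameters on top of init-time defaults."""
    merged = {**(defaults or {}), **overrides}
    for flt in merged.get("filters", []):
        missing = REQUIRED_FILTER_KEYS - set(flt)
        if missing:
            raise ValueError(f"filter {flt!r} is missing keys: {sorted(missing)}")
    return merged


model_kwargs = {
    "size": 5,
    "filters": [
        {"field_name": "document", "filter_type": "fuzzy_search", "value": "Einstein"}
    ],
}

# No overrides: the defaults are used unchanged.
print(merge_call_params(model_kwargs))
# A per-call size replaces the default; the default filters still apply.
print(merge_call_params(model_kwargs, size=2))
```

Building a new dictionary on each call leaves the init-time defaults untouched between calls.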
10 changes: 10 additions & 0 deletions libs/langchain/langchain/llms/__init__.py
@@ -52,6 +52,12 @@ def _import_anyscale() -> Any:
    return Anyscale


def _import_arcee() -> Any:
    from langchain.llms.arcee import Arcee

    return Arcee


def _import_aviary() -> Any:
    from langchain.llms.aviary import Aviary

@@ -479,6 +485,8 @@ def __getattr__(name: str) -> Any:
        return _import_anthropic()
    elif name == "Anyscale":
        return _import_anyscale()
    elif name == "Arcee":
        return _import_arcee()
    elif name == "Aviary":
        return _import_aviary()
    elif name == "AzureMLOnlineEndpoint":
@@ -633,6 +641,7 @@ def __getattr__(name: str) -> Any:
    "AmazonAPIGateway",
    "Anthropic",
    "Anyscale",
    "Arcee",
    "Aviary",
    "AzureMLOnlineEndpoint",
    "AzureOpenAI",
@@ -713,6 +722,7 @@ def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
        "amazon_bedrock": _import_bedrock,
        "anthropic": _import_anthropic,
        "anyscale": _import_anyscale,
        "arcee": _import_arcee,
        "aviary": _import_aviary,
        "azure": _import_azure_openai,
        "azureml_endpoint": _import_azureml_endpoint,
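
Note on the hunks above: `Arcee` is registered through a module-level `__getattr__` (the PEP 562 hook, available since Python 3.7) and a name-to-importer mapping, so `langchain.llms.arcee` is imported only when the name is first requested. The pattern can be sketched with standard-library stand-ins as below. The `lazy_demo` module, `_lazy_getattr`, and the use of `fractions.Fraction` are illustrative assumptions, not code from this commit.

```python
import types
from typing import Any, Callable, Dict


def _import_fraction() -> Any:
    # The import runs only when this function is called, not at module load.
    from fractions import Fraction

    return Fraction


def _lazy_getattr(name: str) -> Any:
    # Called only for names that are not already present on the module.
    if name == "Fraction":
        return _import_fraction()
    raise AttributeError(f"Could not find: {name}")


# Build a throwaway module and attach the hook, as a package __init__ would.
lazy = types.ModuleType("lazy_demo")
lazy.__getattr__ = _lazy_getattr

# A string-keyed registry of importers, similar in shape to a type-to-class map.
registry: Dict[str, Callable[[], Any]] = {"fraction": _import_fraction}

print(lazy.Fraction(1, 2))
print(registry["fraction"]() is lazy.Fraction)
```

Deferring the import inside a function keeps optional dependencies from being loaded, or from failing, when the package itself is imported.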