In addition to the general contribution guidelines, there are a few extra things to consider when contributing third-party integrations to LangChain that will be covered here. The goal of this page is to help you draft PRs that take these considerations into account, and can therefore be merged sooner.
Integrations tend to fall into a set number of categories, each of which has its own section below. Please read the general guidelines, then see the integration-specific guidelines and example PRs section at the end of this page for additional information and examples.
The following guidelines apply broadly to all types of integrations:
You should generally not export your new module from an `index.ts` file that contains many other exports. Instead, you should add a separate entrypoint for your integration in `libs/langchain-community/langchain.config.js` within the `entrypoints` field in the config object:
```javascript
export const config = {
  internals: [ ... ],
  entrypoints: {
    load: "load/index",
    ...
    "vectorstores/chroma": "vectorstores/chroma",
    "vectorstores/hnswlib": "vectorstores/hnswlib",
    ...
  },
  ...
};
```
The entrypoint name should conform to its path in the repo. For example, if you were adding a new vector store for a hypothetical provider "langco", you might create it under `vectorstores/langco.ts`. You would then add it to the config above as:
```javascript
export const config = {
  internals: [ ... ],
  entrypoints: {
    load: "load/index",
    ...
    "vectorstores/chroma": "vectorstores/chroma",
    "vectorstores/hnswlib": "vectorstores/hnswlib",
    "vectorstores/langco": "vectorstores/langco",
    ...
  },
  ...
};
```
A user would then import your new vector store as `import { LangCoVectorStore } from "@langchain/community/vectorstores/langco";`.
You may use third-party dependencies in new integrations, but they should be added as `peerDependencies` and `devDependencies` with an entry under `peerDependenciesMeta` in `libs/langchain-community/package.json`, not under any core `dependencies` list. This keeps the overall package size small, since only people who use your integration will need to install it, and allows us to support a wider range of runtimes.
We suggest using caret syntax (`^`) for peer dependencies to support a wider range of users, and to be somewhat tolerant of non-major version updates, which should (theoretically) be the only breaking ones.
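For instance, if the hypothetical vector store above wrapped a `langco-client` SDK (both the package name and version numbers here are illustrative), the relevant excerpts of `libs/langchain-community/package.json` might look like this:

```json
{
  "devDependencies": {
    "langco-client": "^1.2.3"
  },
  "peerDependencies": {
    "langco-client": "^1.2.3"
  },
  "peerDependenciesMeta": {
    "langco-client": {
      "optional": true
    }
  }
}
```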
Please make sure all introduced dependencies are permissively licensed (MIT is recommended) and well-supported and maintained.
You must also add your new entrypoint under `requiresOptionalDependency` in the `langchain.config.js` file to avoid breaking the build:
```javascript
export const config = {
  internals: [ ... ],
  entrypoints: {
    load: "load/index",
    ...
    "vectorstores/chroma": "vectorstores/chroma",
    "vectorstores/hnswlib": "vectorstores/hnswlib",
    "vectorstores/langco": "vectorstores/langco",
    ...
  },
  requiresOptionalDependency: [
    ...
    "vectorstores/langco",
    ...
  ],
  ...
};
```
If you have conformed to all of the above guidelines, you can just import your dependency as normal in your integration's file in the LangChain repo. Developers who import your entrypoint will then see an error message if they are missing the required peer dependency.
Many integrations initialize instances of third-party clients, which often require vendor-specific configuration and options in addition to LangChain-specific configuration. To avoid unnecessary repetition and falling out of sync with the client library, we suggest using imported third-party configuration types whenever available, unless there's a specific reason to only support a subset of these options.
Here's a simplified example:
```typescript
import {
  LangCoClient,
  LangCoClientOptions,
} from "langco-client";

import { BaseDocumentLoader, DocumentLoader } from "../base.js";

export class LangCoDatasetLoader
  extends BaseDocumentLoader
  implements DocumentLoader
{
  protected langCoClient: LangCoClient;

  protected datasetId: string;

  protected verbose: boolean;

  constructor(
    datasetId: string,
    config: {
      verbose?: boolean;
      clientOptions?: LangCoClientOptions;
    }
  ) {
    super();
    this.datasetId = datasetId;
    this.langCoClient = new LangCoClient(config.clientOptions ?? {});
    this.verbose = config.verbose ?? false;
  }

  ...
}
```
Above, we have a document loader that we're sure will always require a specific `datasetId`, and then a `config` object whose properties could change in the future, containing a LangChain-specific configuration property, `verbose`. We have also put a `clientOptions` parameter within that `config` that is passed directly into the third-party client. With this structure, if the underlying client adds new options, all we need to do is bump the version.
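To make the pattern concrete, here's a sketch of how a user might instantiate the hypothetical loader above, passing client options straight through (the import path and option names are illustrative):

```typescript
import { LangCoDatasetLoader } from "@langchain/community/document_loaders/web/langco";

const loader = new LangCoDatasetLoader("my-dataset-id", {
  verbose: true,
  // These options are passed through untouched to the underlying
  // LangCoClient, so new client options work without changes here.
  clientOptions: {
    apiKey: "YOUR_LANGCO_API_KEY",
  },
});

const docs = await loader.load();
```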
We highly appreciate documentation and integration tests showing how to set up and use your integration. Providing this will make it much easier for reviewers to verify that your integration works and will streamline the review process.
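For example, an integration test for the hypothetical loader above might look something like the following minimal sketch (the community package's existing tests use Jest with a `.int.test.ts` suffix for tests that hit live services; the file path and environment variable here are illustrative):

```typescript
// document_loaders/tests/langco.int.test.ts
import { test, expect } from "@jest/globals";

import { LangCoDatasetLoader } from "../web/langco.js";

test("LangCoDatasetLoader loads documents from a dataset", async () => {
  // Assumes a real LANGCO_API_KEY is set in the environment.
  const loader = new LangCoDatasetLoader("my-dataset-id", {
    clientOptions: { apiKey: process.env.LANGCO_API_KEY },
  });
  const docs = await loader.load();
  expect(docs.length).toBeGreaterThan(0);
  expect(docs[0].pageContent).toBeDefined();
});
```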
New docs pages should be added using the appropriate template from here:
As with all contributions, make sure you run `yarn lint` and `yarn format` so that everything conforms to our established style.
While most integrations should generally reside in the `libs/langchain-community` workspace and be imported as `@langchain/community/module/name`, more in-depth integrations or suites of integrations may also reside in separate packages that depend on and extend `@langchain/core`. See `@langchain/google-genai` for an example.
To make creating packages like this easier, we offer the `create-langchain-integration` utility that will automatically scaffold a repo with support for both ESM + CJS entrypoints. You can run it like this:
```bash
$ npx create-langchain-integration
```
The workflows and considerations for these packages are mostly the same as those for `@langchain/community`, with the exception that third-party dependencies should be hard dependencies instead of peer dependencies, since the end user will manually install your integration package anyway.
You will need to make sure that your package is compatible with the current minor version of `@langchain/core` in order for it to be interoperable with other integration packages and the latest versions of LangChain. We recommend using tilde syntax (`~`) for your integration package's `@langchain/core` dependency to support a wider range of core patch versions.
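Putting both of the above together, the relevant excerpt of a standalone integration package's `package.json` might look like this (the version numbers and `langco-client` dependency are illustrative):

```json
{
  "dependencies": {
    "@langchain/core": "~0.3.0",
    "langco-client": "^1.2.3"
  }
}
```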
Below are links to guides with advice and tips for specific types of integrations. These are currently out of date with the `@langchain/community` split, but will give you a rough idea of what is necessary:
- LLM providers (e.g. OpenAI's GPT-3)
- Chat model providers (TODO) (e.g. Anthropic's Claude, OpenAI's GPT-4)
- Memory (used to give an LLM or chat model context of past conversations, e.g. Motörhead)
- Vector stores (e.g. Pinecone)
- Persistent message stores (used to persistently store and load raw chat histories, e.g. Redis)
- Document loaders (used to load documents for later storage into vector stores, e.g. Apify)
- Embeddings (used to create embeddings of text documents or strings, e.g. Cohere)
- Tools (used for agents, e.g. the SERP API tool)
This is a living document, so please make a pull request if we're missing anything useful!