Is your feature request related to a problem? Please describe.
Right now only OpenAI embeddings are supported, which requires an API call to OpenAI to embed memories before persisting them into pgvector.
Describe the solution you'd like
This will be property-driven: depending on the existing environment variable USE_OPENAI_EMBEDDING, either the OpenAI embeddings API will be used (current behavior) or the local BGESmallENV15 model from fastembed.
Describe alternatives you've considered
BGE-Large is also a possibility, but BGE-Small is ~125MB vs ~420MB for BGE-Large, and BGE-Small performs at ~95-97% of BGE-Large's quality on most benchmarks. BGE-Small reaches roughly 70-80% of OpenAI's embedding quality with no API cost and no network latency.
Additional context
I have this running locally on supabase, and will cherry-pick out the changes for a new PR for this issue.
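The property-driven switch described above could be sketched roughly like this (a sketch only: the helper names and `Env` type are illustrative, not this project's actual API, and the fastembed-js calls in the comments are untested):

```typescript
// Sketch of the proposed property-driven embedding switch.
// Assumption: names below are illustrative, not the project's real helpers.

type Env = Record<string, string | undefined>;

const OPENAI_DIM = 1536;    // OpenAI embedding vector size
const BGE_SMALL_DIM = 384;  // BAAI bge-small-en-v1.5 vector size

// One env flag decides the backend, as the issue describes.
function useOpenAIEmbedding(env: Env): boolean {
  return (env.USE_OPENAI_EMBEDDING ?? "").toUpperCase() === "TRUE";
}

// pgvector column dimension must match the chosen backend.
function embeddingDimension(env: Env): number {
  return useOpenAIEmbedding(env) ? OPENAI_DIM : BGE_SMALL_DIM;
}

// The local path would use fastembed's BGESmallENV15, e.g. (untested sketch):
//
//   import { FlagEmbedding, EmbeddingModel } from "fastembed";
//   const model = await FlagEmbedding.init({ model: EmbeddingModel.BGESmallENV15 });
//   const vector = await model.queryEmbed(text); // 384 floats, no API call
```

Note that the pgvector column dimension has to agree with whichever backend is active, which is why the flag also drives the vector size.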
OK, I tried for two days to create a clean PR for you guys, but couldn't do it; too many changes. Here's the commit with my changes that adds support for BGE/384 local embeddings.
Here are the details; check the .env.example file for this section:
# Feature Flags
IMAGE_GEN= # Set to TRUE to enable image generation
USE_OPENAI_EMBEDDING= # Set to TRUE for OpenAI/1536, leave blank for local
USE_OLLAMA_EMBEDDING= # Set to TRUE for OLLAMA/1024, leave blank for local
If using Ollama, set USE_OLLAMA_EMBEDDING=TRUE and leave USE_OPENAI_EMBEDDING blank.
If you want to use BGE/384 (local embeddings using the BAAI General Embeddings small model), leave both USE_OLLAMA_EMBEDDING and USE_OPENAI_EMBEDDING blank.
These embeddings are used for persisting and searching the memories table; they are separate and distinct from LLM inference, so the two can be mixed and matched (I use Anthropic for inference and BGE for embeddings).
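The flag combinations above can be summarized in one resolver (a sketch, not the project's code; the env variable names match the .env.example section quoted above, and the dimensions come from the OpenAI/1536, OLLAMA/1024, and BGE/384 figures in this comment):

```typescript
// Resolve the embedding provider and vector dimension from the two
// feature flags, matching the truth table described in the comment.

type Env = Record<string, string | undefined>;

interface EmbeddingChoice {
  provider: "openai" | "ollama" | "bge-local";
  dimension: number;
}

function resolveEmbedding(env: Env): EmbeddingChoice {
  const isTrue = (v?: string) => (v ?? "").toUpperCase() === "TRUE";
  if (isTrue(env.USE_OPENAI_EMBEDDING)) return { provider: "openai", dimension: 1536 };
  if (isTrue(env.USE_OLLAMA_EMBEDDING)) return { provider: "ollama", dimension: 1024 };
  // Both flags blank: local BGE-Small via fastembed, 384-dim vectors.
  return { provider: "bge-local", dimension: 384 };
}
```

OpenAI takes precedence here if both flags are accidentally set to TRUE; that tie-break is my assumption, not something the comment specifies.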