RAGMind is an open-source Retrieval-Augmented Generation (RAG) engine designed for in-depth document comprehension. It provides an optimized RAG workflow suitable for businesses of any size, leveraging Large Language Models (LLMs) to deliver accurate question-answering capabilities backed by reliable citations from data sources with complex formats.
⚠️ Warning: This project is currently under active development and may be unstable. It is not recommended for production use at this stage, as features and functionality are still evolving and subject to significant changes. Please use it for testing and development purposes only.
- Extracts knowledge from unstructured data with complex formats through deep document understanding.
- Locates the "needle in a data haystack," handling virtually limitless tokens.
- Visual text chunking enables human oversight.
- Quick access to key references and traceable citations, supporting accurate answers.
- Simplified RAG orchestration tailored for individuals and enterprises alike.
- Customizable LLMs and embedding models.
- Multi-step retrieval with enhanced re-ranking fusion (a fusion sketch follows this list).
- User-friendly APIs for smooth business integration.
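The re-ranking fusion mentioned above combines candidate lists from multiple retrieval passes into a single ranking. As a minimal illustrative sketch (the fusion method RAGMind actually uses is not specified here), Reciprocal Rank Fusion (RRF) is one standard way to merge, say, a dense-vector ranking with a keyword ranking:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several best-first ranked lists of document IDs into one.

    k is the damping constant from the original RRF formulation; 60 is customary.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)  # highest fused score first

# Example: fuse a vector-search ranking with a keyword-search ranking
dense = ["doc3", "doc1", "doc7"]
sparse = ["doc1", "doc9", "doc3"]
print(reciprocal_rank_fusion([dense, sparse]))  # doc1 edges out doc3
```

RRF rewards documents that rank highly in several lists and needs no score calibration between retrievers, which is why it is a common default for hybrid retrieval.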
- CPU >= 4 cores
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
If you have not installed Docker on your local machine (Windows, Mac, or Linux), see Install Docker Engine.
Recommendation: For optimal performance, install NVIDIA CUDA version 12.1 or higher. Installation instructions and further details are available in the official NVIDIA CUDA documentation.
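You can check what your machine currently provides before installing:

```bash
nvidia-smi       # the header shows the highest CUDA version the driver supports
nvcc --version   # reports the installed CUDA toolkit version, if one is present
```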
- Clone the repo:

  ```bash
  git clone https://github.com/erofcon/ragmind.git
  ```
- Navigate to the project directory:

  ```bash
  cd ragmind
  ```
- Pull the pre-built Docker images and start up the server:

  ```bash
  cd docker
  docker compose -f docker-compose.yml up -d
  ```
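Once the containers are up, you can confirm they are healthy (the service names depend on the compose file):

```bash
docker compose -f docker-compose.yml ps        # list running services and their state
docker compose -f docker-compose.yml logs -f   # follow the startup logs
```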
- Install requirements:

  ```bash
  pip install -r requirements.txt
  ```
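If you prefer to keep these dependencies isolated, creating a virtual environment before installing is standard Python practice (not a project requirement):

```bash
python -m venv .venv
source .venv/bin/activate   # on Windows: .venv\Scripts\activate
```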
⚠️ Please ensure you install a PyTorch build with CUDA support that matches your system. Using an incorrect version may lead to errors or reduced performance.

To install PyTorch with CUDA, use the following command, replacing `cu121` with the appropriate CUDA version:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

For the latest installation commands, visit the official PyTorch page.
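A quick check that the CUDA-enabled build is actually in use (assumes the install above succeeded):

```python
import torch

print(torch.__version__)          # e.g. '2.3.0+cu121' for a CUDA 12.1 build
print(torch.cuda.is_available())  # True means PyTorch can see your GPU
```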
If you need to set up database migrations from scratch, use the following commands. However, the first two steps are optional if the migration setup already exists.
- Initialize Alembic with async support (optional if already set up):

  ```bash
  alembic init -t async migrations
  ```

- Generate an initial migration script (optional if migrations already exist):

  ```bash
  alembic revision --autogenerate -m "init"
  ```

- Apply the migrations to the database:

  ```bash
  alembic upgrade head
  ```
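Afterwards, Alembic can confirm the database is where you expect it to be:

```bash
alembic current   # prints the revision the database is currently on
alembic history   # lists all revisions known to the migrations directory
```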
Note: To run this project, you’ll need access to a Large Language Model (LLM), either by hosting one locally or by using an existing service like OpenAI's ChatGPT. For a local setup, you can explore tools like LM Studio as an example solution for deploying and managing language models on your own machine.
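As an illustration of the local route: LM Studio serves models through an OpenAI-compatible endpoint (http://localhost:1234/v1 by default), so a standard OpenAI-style client can talk to it. The model name below is a placeholder, not something defined by this project:

```python
from openai import OpenAI

# LM Studio's local server speaks the OpenAI API; the key can be any non-empty string.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # placeholder; use the identifier shown in LM Studio
    messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
)
print(response.choices[0].message.content)
```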
Configuration: LLM settings can be customized in the `settings.py` file to suit your specific requirements, whether you're using a local model or an external service.

Dependency Notice: Before running the project, ensure you download the necessary model dependencies by executing the `download_deeps.py` script:

```bash
python download_deeps.py
```

This will fetch all required models so the system functions properly.
To start the project, simply execute the following command:
```bash
python main.py
```

This will launch the application with the current configuration settings.
Note: If you'd like to launch the web interface for this project, please follow the setup instructions in the dedicated repository here.
This project is open-source and distributed under the MIT License, which permits free use, modification, and distribution. For more details, please see the MIT License documentation.