A simplified RAG (Retrieval-Augmented Generation) system built with Flask and Python, featuring persistent vector storage backed by PostgreSQL/pgvector, configurable LLM providers, and real-time search capabilities.
This project serves as a demonstration of using the Replit AI Agent for full-stack application development. It showcases how AI-assisted development can efficiently create a functional RAG system with various features and integrations.
The project was developed iteratively following these key steps:
- **Initial Prototype**
- Set up Flask web application
- Implemented basic PDF upload and text extraction
- Created initial vector storage mechanism
- Established basic search functionality
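The initial prototype's storage layer can be illustrated with a minimal in-memory store (a hypothetical sketch: the class and method names are illustrative and do not reflect the project's actual `vector_store.py` API):

```python
import math

class InMemoryVectorStore:
    """Hypothetical minimal version of the initial in-memory storage mechanism."""

    def __init__(self):
        self._entries = []  # list of (text, vector) pairs

    def add(self, text, vector):
        self._entries.append((text, vector))

    def search(self, query_vector, top_k=3):
        # Rank stored chunks by cosine similarity to the query vector.
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        scored = [(cosine(query_vector, vec), text) for text, vec in self._entries]
        scored.sort(reverse=True)
        return [text for _, text in scored[:top_k]]
```

A store like this works for a prototype but loses all data on restart, which motivated the later migration to PostgreSQL/pgvector.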
- **PostgreSQL/pgvector Integration**
- Integrated persistent vector storage
- Implemented database schema for documents and embeddings
- Added vector similarity search capabilities
- **Advanced Chunking Strategies**
- Developed semantic-aware text chunking
- Implemented overlap between chunks
- Added metadata retention for context
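The overlap idea above can be sketched as a fixed-window splitter (a simplified, hypothetical version; the project's actual chunker is semantic-aware rather than purely character-based):

```python
def chunk_text(text, chunk_size=2000, overlap=400):
    """Split text into overlapping fixed-size chunks.

    Simplified sketch: the real implementation additionally respects
    semantic boundaries (sentences/paragraphs) and retains metadata.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

The defaults mirror the documented `CHUNK_SIZE` (2000) and `CHUNK_OVERLAP` (400) settings; the overlap ensures context that straddles a chunk boundary appears in both neighboring chunks.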
- **Multiple Document Support**
- Added multi-file upload capability
- Implemented document management system
- Created document listing and deletion features
- **Real-time Frontend Features**
- Added search suggestions with debouncing
- Implemented result highlighting
- Created loading indicators
- Enhanced user interface responsiveness
- **Documentation**
- Created comprehensive documentation
- Added usage instructions
- Documented API and configurations
The development process was highly interactive and iterative:
- **Feature Implementation**
- Features were developed incrementally with continuous feedback
- Each component was tested individually before integration
- Improvements were made based on user feedback
- **Testing Process**
- Upload functionality was tested with various PDF formats
- Search capabilities were verified with different query types
- Real-time features were tested for responsiveness
- Database operations were validated for correctness
- **Issue Resolution**
- Database schema issues were identified and fixed during vector storage integration
- Template loading errors were resolved through proper path configuration
- PDF processing errors were handled with robust error management
- Search performance was optimized through index improvements
Here are some real examples of how the AI agent and user collaborated during development:
- **Database Schema Fix**
  - User: `ERROR:root:Error processing file Mixture-of-Experts.pdf: column "document_id" of relation "documents" does not exist`
  - Agent: The error indicates an issue with the database schema. I'll update the `_create_table` method to include the `document_id` column...
  - User: Confirmed it works
- **Search Suggestions Implementation**
  - User: The suggestions array remains empty as I type
  - Agent: Let's investigate by adding more detailed logging in the suggestions functionality...
These interactions demonstrate the iterative development process and how issues were identified and resolved through collaboration between the AI agent and user.
- **Document Processing**
- Multiple PDF document upload support
- Advanced text chunking strategies
- Document management (view, delete)
- Real-time document statistics
- **Vector Search**
- PostgreSQL with pgvector for persistent storage
- Efficient similarity search
- Real-time search suggestions with debouncing
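With pgvector, similarity search reduces to ordering rows by a distance operator. A hedged sketch of the kind of query involved (the `chunks` table and `embedding` column names are assumptions, not the project's actual schema; `<->` is pgvector's Euclidean-distance operator, and `all-MiniLM-L6-v2` produces 384-dimensional embeddings):

```python
def build_similarity_query():
    """Return a parameterized pgvector nearest-neighbour query.

    Hypothetical schema: a `chunks` table with `document_id`, `content`,
    and an `embedding vector(384)` column. With psycopg2 it would be
    executed roughly as:
        cur.execute(build_similarity_query(),
                    {"query_embedding": query_vec, "top_k": 5})
    """
    return (
        "SELECT document_id, content "
        "FROM chunks "
        "ORDER BY embedding <-> %(query_embedding)s "
        "LIMIT %(top_k)s"
    )
```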
- **User Interface**
- Interactive search experience
- Result highlighting
- Visual loading indicators
- Document management interface
The interface is organized into four main sections:
- Upload Section: Allows users to select and upload multiple PDF files
- Document List: Displays uploaded documents with options to delete them
- Search Interface: Features a search box with real-time suggestions as you type
- Results Area: Shows the generated answer and relevant context with highlighted search terms
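The search-term highlighting shown in the Results Area can be sketched as a small helper (a hypothetical server-side version; the project may instead highlight on the frontend in `main.js`):

```python
import re
from html import escape

def highlight_terms(text, query):
    """Wrap each query term in <mark> tags, case-insensitively.

    Naive sketch: a term that happens to match text inside an
    already-inserted <mark> tag would be re-matched; a production
    version would highlight in a single pass.
    """
    out = escape(text)  # escape user content before injecting HTML tags
    for term in query.split():
        pattern = re.compile(re.escape(escape(term)), re.IGNORECASE)
        out = pattern.sub(lambda m: f"<mark>{m.group(0)}</mark>", out)
    return out
```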
- Clone the repository
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables:

  ```bash
  PGHOST=your_postgres_host
  PGPORT=your_postgres_port
  PGDATABASE=your_postgres_database
  PGUSER=your_postgres_user
  PGPASSWORD=your_postgres_password
  OPENAI_API_KEY=your_openai_api_key
  ```
- Start the Flask server:

  ```bash
  python -m rag_system.main
  ```
- Open your browser and navigate to `http://localhost:5000`
- Upload PDF documents using the upload form
- Use the search functionality to query your documents
The system can be configured through environment variables:
- `EMBEDDING_MODEL`: Model used for text embeddings (default: `sentence-transformers/all-MiniLM-L6-v2`)
- `LLM_PROVIDER`: LLM provider for text generation (default: `openai`)
- `CHUNK_SIZE`: Size of text chunks (default: `2000`)
- `CHUNK_OVERLAP`: Overlap between chunks (default: `400`)
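Reading these settings with their documented defaults might look like the following (a sketch; the project's actual configuration module is not shown here):

```python
import os

# Hypothetical config module mirroring the variables documented above;
# each setting falls back to its documented default when unset.
EMBEDDING_MODEL = os.environ.get(
    "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
)
LLM_PROVIDER = os.environ.get("LLM_PROVIDER", "openai")
CHUNK_SIZE = int(os.environ.get("CHUNK_SIZE", "2000"))
CHUNK_OVERLAP = int(os.environ.get("CHUNK_OVERLAP", "400"))
```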
```
rag_system/
├── app/
│   ├── models/
│   │   └── vector_store.py    # Vector storage implementation
│   ├── services/
│   │   ├── llm_adapter.py     # LLM integration
│   │   └── pdf_processor.py   # PDF processing logic
│   ├── templates/
│   │   └── index.html         # Main HTML template
│   └── routes.py              # Flask routes
├── static/
│   ├── css/
│   │   └── style.css          # Application styles
│   └── js/
│       └── main.js            # Frontend JavaScript
└── main.py                    # Application entry point
```
- Flask: Web framework
- PyPDF2: PDF processing
- sentence-transformers: Text embeddings
- pgvector: Vector similarity search
- psycopg2: PostgreSQL adapter
- OpenAI: LLM integration
Contributions are welcome! Please feel free to submit pull requests.
This project is licensed under the MIT License.