DocChat: Intelligent Document Q&A with Adjustable Response Style & Difficulty via Retrieval Augmented Generation (RAG)
DocChat is a web application that uses Retrieval Augmented Generation (RAG) to answer questions about an uploaded document, in a response format and at a difficulty level chosen by the user.
The application uses:
- LangChain for chaining LLM operations
- OpenAI's text-embedding-ada-002 model for document embeddings
- Pinecone for vector storage and similarity search
- Streamlit for the user interface
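The retrieval step behind this stack can be illustrated without any external services. The following is a minimal sketch, not DocChat's actual code: the `embed`, `cosine`, and `retrieve` functions are hypothetical, a bag-of-words counter stands in for the text-embedding-ada-002 embeddings, and an in-memory list stands in for the Pinecone index.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (the 'R' in RAG)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical document chunks, as if split out of an uploaded PDF.
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The warranty covers parts and labor for one year.",
    "Late invoices incur a 2 percent monthly fee.",
]
top = retrieve("When are invoices due?", chunks, k=1)
print(top)
```

In a real deployment the retrieved chunks would be passed to the LLM as context; the ranking idea (nearest vectors to the query vector) is the same.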
Below are some of the key features of DocChat:
- Upload & Process PDFs: Users can upload their PDF files to be processed instantly
- Dynamic Querying: Ask questions related to document content with AI-driven responses
- Response Customization: Adjust response format and difficulty level
- Cloud-based Processing: Embedding and similarity search run on hosted services (OpenAI and Pinecone)
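One plausible way to implement response customization is to fold the chosen format and difficulty into the prompt sent to the LLM. The sketch below assumes this approach; the option names (`paragraph`, `bullets`, `beginner`, `expert`), the instruction wording, and the `build_prompt` function are illustrative and are not taken from the DocChat source.

```python
# Hypothetical option tables; the real app may offer different choices.
STYLES = {
    "paragraph": "Answer in a short paragraph.",
    "bullets": "Answer as a bulleted list.",
}
LEVELS = {
    "beginner": "Use plain language and define any technical terms.",
    "expert": "Assume domain knowledge and be concise.",
}


def build_prompt(question, context_chunks, style="paragraph", level="beginner"):
    """Combine retrieved context, the question, and style instructions."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"{STYLES[style]} {LEVELS[level]}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "When are invoices due?",
    ["Invoices are due within 30 days of receipt."],
    style="bullets",
    level="expert",
)
print(prompt)
```

Keeping the options in lookup tables means a new format or difficulty level only needs a new table entry, and invalid selections fail early instead of producing a malformed prompt.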
Prerequisites: Python 3.8+, pip, Git
- Clone this repository and navigate to the local project folder. Activate your virtual environment (optional).
- Install dependencies
pip install -r requirements.txt
- Create .env file in main directory
touch .env
- Set your environment variables
OPENAI_API_KEY=your-openai-api-key
PINECONE_API_KEY=your-pinecone-api-key
INDEX_NAME=your-pinecone-index-name
- Run the app
streamlit run app.py
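If the app fails at startup, a missing environment variable is a likely cause. The snippet below is a small sketch for checking the three variables named in the setup steps; the `missing_settings` helper is hypothetical and not part of the repository. It reads from the process environment, so the .env file must already have been loaded (how app.py loads it is not stated here; python-dotenv's `load_dotenv` is a common choice).

```python
import os

# Variable names taken from the setup steps above.
REQUIRED = ("OPENAI_API_KEY", "PINECONE_API_KEY", "INDEX_NAME")


def missing_settings(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.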
Author: Ronoy Sarkar
Project Link: https://github.com/ronoys/DocChat