VarDaan.ai is an AI-powered platform that turns web content (blogs, articles, and YouTube videos) and PDF documents into a chatbot you can interact with. Provide a URL or upload a file, then ask natural language questions about the content and receive accurate, contextually relevant answers in real time.
- Web Content as Chatbots: Turn any web page, blog, or article into an interactive chatbot.
- YouTube Video Chatbot: Input a YouTube video URL and ask questions based on video content.
- PDF Support: Upload PDF files and generate an AI chatbot to query the content.
- Conversational AI: Powered by advanced natural language processing models that provide accurate answers based on the context.
- Simple User Interface: Easy-to-use web interface that allows users to engage with different types of content seamlessly.
Make sure you have the following installed:
- Python 3.8+
- pip (Python package installer)
- Git
- Clone the Repository
  Open your terminal and run the following command to clone the repository:
  git clone https://github.com/amMistic/vardaan.ai.git
- Navigate to the Project Directory
  Move into the project folder (git names it after the repository):
  cd vardaan.ai
- Install Dependencies
  Install the required Python packages by running:
  pip install -r requirements.txt
- Set Up Environment Variables
  Create a .env file in the root directory and add the necessary environment variables (such as API keys). Example:
  PINECONE_API_KEY=<your_pinecone_api_key>
  HUGGINGFACE_API_TOKEN=<your_api_token>
- Run the Application
  Start the VarDaan.ai application using Streamlit:
  streamlit run app.py
- Access the App
  Open your web browser and navigate to the local server link provided by Streamlit (usually http://localhost:8501).
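Before starting the app, it can help to confirm that the variables from the environment setup step are visible to Python. The following is a minimal sketch using only the standard library. It assumes the two variable names shown in the .env example; `missing_env_vars` is a hypothetical helper, not part of the repository. It reads the process environment only, so the .env values must already have been loaded by your shell or by the app.

```python
import os

# Variable names taken from the .env example; adjust if your setup differs.
REQUIRED_VARS = ("PINECONE_API_KEY", "HUGGINGFACE_API_TOKEN")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary as `environ` makes the check easy to try without touching your real environment.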
- Web Interface: Once the app is running, you will see a simple input field for URLs or file uploads (PDFs).
- Enter Content:
  - For blogs/articles: Enter the URL of the blog or article.
  - For YouTube videos: Enter the YouTube video URL.
  - For PDFs: Upload the PDF document directly into the app.
- Interactive Chat: After processing the content, you can ask questions in the chat interface, and VarDaan.ai will respond based on the content provided.
- Blog/Article: http://example.com/blog-post
- YouTube Video: youtube.com/watch?v=example-video
- PDF: Drag and drop a PDF document into the interface.
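To illustrate how the three input types might be told apart before processing, here is a rough sketch using only the standard library. The function names `classify_source` and `youtube_video_id` and the list of YouTube hostnames are assumptions made for this example; the repository's actual routing logic may differ.

```python
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def _parse(source):
    """Parse a URL, tolerating input typed without a scheme."""
    text = source.strip()
    return urlparse(text if "://" in text else "https://" + text)


def classify_source(source):
    """Return 'pdf', 'youtube', or 'web' for a user-supplied input string."""
    text = source.strip()
    # A bare file name ending in .pdf is treated as an uploaded document.
    if text.lower().endswith(".pdf") and "://" not in text:
        return "pdf"
    host = (_parse(text).hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        return "youtube"
    if host:
        return "web"
    raise ValueError(f"Unrecognized source: {source!r}")


def youtube_video_id(url):
    """Extract the video ID from a watch or youtu.be URL, or return None."""
    parsed = _parse(url)
    if (parsed.hostname or "").lower() == "youtu.be":
        return parsed.path.lstrip("/") or None
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


if __name__ == "__main__":
    for item in ("http://example.com/blog-post",
                 "youtube.com/watch?v=example-video",
                 "notes.pdf"):
        print(item, "->", classify_source(item))
```

The video ID is the piece a transcript lookup typically needs, which is why it is extracted separately from the classification.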
vardaan-ai/
│
├── app.py # Main application entry point
├── src/ # Source code
│ ├── Online_src/ # Web content and YouTube processing
│ ├── Offline_src/ # PDF handling
│ ├── Handle_user.py # Handles user queries and responses
│ ├── embedding_model.py # Embedding logic for vector storage
├── vecDatabase/ # Stores vectorized representations of content
├── requirements.txt # List of Python dependencies
└── README.md # Project documentation
- Extracting Content:
  - Web Content: VarDaan.ai fetches and processes text from web pages or articles using web scraping methods.
  - YouTube: The app uses YouTube's transcript API to extract the spoken text from videos.
  - PDF: For PDF files, VarDaan.ai extracts the textual content and splits it into manageable chunks.
- Processing & Storage:
  - The content is split into smaller text chunks.
  - Each chunk is embedded using a pre-trained NLP model, converting the text into a vector format.
  - These vectors are stored in a vector database (Chroma) for efficient querying.
- Conversational Queries:
  - When the user asks a question, VarDaan.ai retrieves the most relevant chunks from the vector store.
  - The retrieved chunks are passed to a language model, which generates a context-aware response.
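The chunk, embed, store, and retrieve flow described above can be sketched in plain Python. This is an illustration of the idea only, not the project's implementation: the word-count `embed` function stands in for a Hugging Face embedding model, `InMemoryVectorStore` is a hypothetical stand-in for Chroma, and the chunk sizes are arbitrary.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into windows of chunk_size words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database such as Chroma."""

    def __init__(self):
        self._items = []  # list of (chunk_text, vector) pairs

    def add(self, chunks):
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question, k=2):
        """Return the k stored chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self._items,
                        key=lambda item: cosine_similarity(q, item[1]),
                        reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add(["Streamlit renders the web interface.",
               "pdfplumber extracts text from PDF files.",
               "Transcripts come from YouTube videos."])
    print(store.query("How is PDF text extracted?", k=1))
```

Overlapping windows keep a sentence that straddles a chunk boundary visible in both neighbors. In a full pipeline, the chunks returned by `query` would be handed to a language model together with the question; that generation step is omitted here.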
- Python: Core programming language
- Streamlit: Web interface framework
- LangChain: Used for managing document chains and information retrieval
- Chroma: For vector storage and search capabilities
- Hugging Face Models: Provides the embeddings for text representation
- YouTube Transcript API: Used to fetch video transcripts
- pdfplumber: For handling PDF text extraction
I welcome contributions from the community! To get started:
- Fork the repository.
- Create a new branch:
git checkout -b feature/your-feature-name
- Make your changes and commit them:
git commit -m "Add a new feature"
- Push to the branch:
git push origin feature/your-feature-name
- Open a pull request on GitHub from your branch.