ai chat-ui chatbot context-aware document-search embeddings faiss file-metadata langchain llm llms nlp openai pdf pdf-parsing python rag semantic-search streamlit vector-store

🧠 Chatbot Using RAG and LangChain

A Streamlit-based chatbot powered by Retrieval-Augmented Generation (RAG) and OpenAI. Upload your PDFs and chat with them! This app uses LangChain, FAISS, and OpenAI’s GPT models to index document content and answer questions about it, citing the source file name and page number with each answer.

🔧 Features

🔍 Upload multiple PDFs and query across all of them
📄 Metadata-rich answers with filename and page references
🧠 Uses LangChain + FAISS for semantic search
🤖 Streamlit Chat UI for natural conversation
💾 OpenAI API support with streaming responses

📁 Project Structure

.
├── .gitignore
├── LICENSE
├── README.md             # ← You're reading it
├── app.py                # Main Streamlit app
├── brain.py              # PDF parsing and vector index logic
├── compare medium.gif    # Optional UI illustration
├── requirements.txt      # Python dependencies
└── thumbnail.webp        # Preview image

🚀 Getting Started

1. Clone the Repository

git clone https://github.com/co-dev0909/chatbot-using-rag-and-langchain.git
cd chatbot-using-rag-and-langchain

2. Install Dependencies

pip install -r requirements.txt

3. Set OpenAI API Key

Create a .streamlit/secrets.toml file with:

OPENAI_API_KEY = "your-openai-key"

Or export it as an environment variable:

export OPENAI_API_KEY="your-openai-key"
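
Since the key can come from either place, a small helper can make the lookup order explicit. The following is only a minimal sketch: the function name `resolve_api_key` and the choice to prefer the environment variable over the secrets file are assumptions for illustration, not necessarily what app.py does.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    secrets: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the API key from the environment, falling back to `secrets`.

    `secrets` is any dict-like object holding the values from secrets.toml.
    The lookup order (environment first) is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(name) or secrets.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in the environment or in secrets.toml")
    return key
```

In a Streamlit app, the dict-like `st.secrets` object could plausibly be passed as `secrets`; a plain dict works equally well for testing.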

4. Run the App

streamlit run app.py

📚 How It Works

Upload one or more PDFs via the UI
Each PDF is parsed with PyPDF2 and split into chunks by LangChain’s RecursiveCharacterTextSplitter
Each chunk is embedded with OpenAI Embeddings
The embeddings are stored in a FAISS vector store for semantic similarity search
Each query is matched against the most similar chunks, which are passed to the OpenAI GPT model as context
Answers include the file name and page number of the source chunks for citation
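
To make the flow above concrete, here is a toy, standard-library-only sketch of the chunk, embed, and retrieve steps. It is not the code in brain.py: word-count vectors stand in for OpenAI Embeddings, a sorted list stands in for the FAISS index, and a fixed-size character window stands in for RecursiveCharacterTextSplitter. All names (`Chunk`, `split_pages`, `embed`, `top_chunks`) are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    filename: str
    page: int  # 1-based page number, kept for citations


def split_pages(filename, pages, chunk_size=200, overlap=20):
    """Cut each page into overlapping character windows, keeping file/page metadata."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        for start in range(0, max(len(page_text), 1), step):
            piece = page_text[start:start + chunk_size].strip()
            if piece:
                chunks.append(Chunk(piece, filename, page_number))
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for OpenAI Embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(query, chunks, k=3):
    """Rank chunks by similarity to the query (stand-in for a FAISS search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:k]
```

Because every chunk carries its filename and page, whatever is retrieved can be cited directly. The real app would swap the toy `embed` and `top_chunks` for learned embeddings and a vector index, which match on meaning rather than on shared words.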

🛠️ Tech Stack

Streamlit – UI framework
LangChain – PDF chunking and retrieval
FAISS – Vector search backend
OpenAI GPT – LLM-based answer generation
PyPDF2 – PDF parsing

✅ Example Prompt

"What are the main points from the introduction?"

Answer: The introduction highlights... (example.pdf, page 1)
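
One way an answer like this could end up with a citation is to label each retrieved chunk with its source before sending it to the model. The sketch below is illustrative only: the prompt wording and the helper names `format_citation` and `build_prompt` are assumptions, not the prompt the app actually uses.

```python
def format_citation(filename, page):
    """Render a source reference in the same style as the example answer."""
    return f"({filename}, page {page})"


def build_prompt(question, retrieved):
    """Assemble a prompt from retrieved chunks.

    `retrieved` is a list of (text, filename, page) tuples, for example
    the top matches returned by the vector store.
    """
    context = "\n\n".join(
        f"{text}\nSource: {format_citation(filename, page)}"
        for text, filename, page in retrieved
    )
    return (
        "Answer the question using only the context below. "
        "Cite the source of each claim as (filename, page N).\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Keeping the source label next to each chunk gives the model what it needs to attach the file name and page to its answer.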

📄 License

This project is licensed under the MIT License.

📬 Contact

Made with ❤️ by co-dev0909. Contributions welcome!
