In this project, we implement a Retrieval-Augmented Generation (RAG) chatbot that can answer questions about the Next.js documentation.
- Documentation Acquisition: Downloading HTML content from the Next.js Official Documentation.
- HTML Scraping: Extracting critical data, focusing on the `<article>` tag from each page.
- Data Processing: Tokenizing and vectorizing the collected information.
- Data Indexing: Storing the processed data in a Pinecone index for efficient retrieval.
- Chatbot Creation: Using LangChain in conjunction with OpenAI models and the Pinecone index to develop a responsive chatbot.
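To illustrate the scraping step above, here is a minimal sketch of pulling the text inside an `<article>` tag using only the standard library's `HTMLParser`. The actual project may use a different parser (e.g. BeautifulSoup); the class and sample page below are illustrative, not taken from the repository.

```python
from html.parser import HTMLParser

class ArticleExtractor(HTMLParser):
    """Collects text that appears inside an <article> tag."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting level of open <article> tags
        self.chunks = []  # text fragments found inside <article>

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

    def text(self):
        # Join fragments and collapse runs of whitespace.
        return " ".join(" ".join(self.chunks).split())

# Hypothetical docs page: only the <article> content should survive.
page = ("<html><body><nav>Menu</nav>"
        "<article><h1>Routing</h1><p>Pages map to routes.</p></article>"
        "</body></html>")
parser = ArticleExtractor()
parser.feed(page)
print(parser.text())  # → Routing Pages map to routes.
```

Restricting extraction to `<article>` keeps navigation menus, footers, and sidebars out of the index, which noticeably improves retrieval quality.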
- Create a Python virtual environment:

  ```shell
  python -m venv .venv
  ```

- Activate the virtual environment:

  ```shell
  source .venv/bin/activate
  ```

- Install dependencies:

  ```shell
  pip install -r requirements.txt
  ```
- Duplicate `.env.template` to create `.env`. Remember to populate it with your OpenAI API key, Pinecone API key, and Pinecone environment name.
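For reference, a populated `.env` might look like the fragment below. The variable names here are assumptions (check `.env.template` for the names the scripts actually read), and the values are placeholders:

```
OPENAI_API_KEY=your-openai-api-key
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_ENVIRONMENT=your-pinecone-environment
```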
- To download the sources, run the following command in your terminal:

  ```shell
  python download_sources.py
  ```
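One detail a download script needs is a stable mapping from each docs URL to a local file. The sketch below shows one such mapping; `DOC_URLS` and `local_path` are hypothetical names, and the network call is deliberately left as a comment so the example stays offline.

```python
import pathlib
from urllib.parse import urlparse

# Hypothetical subset of pages to fetch; the real script would
# enumerate the full Next.js documentation.
DOC_URLS = [
    "https://nextjs.org/docs/app/building-your-application/routing",
    "https://nextjs.org/docs/app/api-reference/functions/fetch",
]

def local_path(url, out_dir="sources"):
    """Map a docs URL to a deterministic local HTML file path."""
    slug = urlparse(url).path.strip("/").replace("/", "_")
    return pathlib.Path(out_dir) / f"{slug}.html"

for url in DOC_URLS:
    print(local_path(url))
    # The real script would fetch the page (e.g. requests.get(url).text)
    # and write the HTML to this path.
```

A deterministic URL-to-path mapping makes re-runs idempotent: already-downloaded pages can be skipped with a simple existence check.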
- To vectorize the sources, run the following command in your terminal:

  ```shell
  python vectorize_sources.py
  ```
- To start the assistant, run the following command in your terminal:

  ```shell
  streamlit run main.py
  ```
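At query time, the assistant embeds the user's question, retrieves the nearest chunks from the Pinecone index, and feeds them to the OpenAI model as context. Pinecone performs the similarity search server-side; the toy dictionary below merely stands in for it to show the nearest-neighbour idea in plain Python.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-in for the Pinecone index: chunk id -> embedding.
index = {
    "routing": [0.9, 0.1, 0.0],
    "styling": [0.1, 0.8, 0.2],
}

# Embedding of a hypothetical question about routing.
query = [0.85, 0.15, 0.05]

best = max(index, key=lambda k: cosine(index[k], query))
print(best)  # → routing
```

The retrieved chunk(s) are then inserted into the model's prompt, which is what lets the chatbot ground its answers in the Next.js documentation rather than in the model's training data alone.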