This project contains two main Python files:
- `data_processing.py` - Handles PDF processing, chunking, and vector embedding.
- `app.py` - A Streamlit web interface for interacting with a chatbot based on the processed data.
`data_processing.py`:

- Loads and processes multiple PDFs listed in the `SOURCES` list.
- Extracts text from PDFs using PyMuPDF (`fitz`).
- Chunks text by token count using `tiktoken`.
- Embeds text into a Pinecone vector store using OpenAI embeddings (a sketch of this pipeline follows the list).
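The pipeline described above might look roughly like the following. This is a minimal sketch, not the actual contents of `data_processing.py`: the helper names (`pdf_to_text`, `chunk_by_tokens`, `embed_and_upsert`), the chunk size, the embedding model, and the metadata fields are all assumptions, and the snippet targets the current `openai` and `pinecone` Python clients.

```python
# Minimal sketch of the PDF -> chunks -> embeddings -> Pinecone pipeline.
# Helper names, chunk size, model name, and metadata keys are illustrative assumptions.
import os

import fitz  # PyMuPDF
import tiktoken
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
index = pc.Index(os.getenv("PINECONE_INDEX_NAME"))


def pdf_to_text(path: str) -> str:
    """Extract plain text from every page of a PDF with PyMuPDF."""
    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def chunk_by_tokens(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into chunks of roughly chunk_size tokens using tiktoken."""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def embed_and_upsert(source: str, doc_id: str) -> None:
    """Embed each chunk with OpenAI embeddings and upsert it into the Pinecone index."""
    for n, chunk in enumerate(chunk_by_tokens(pdf_to_text(source))):
        embedding = openai_client.embeddings.create(
            model="text-embedding-ada-002", input=chunk
        ).data[0].embedding
        index.upsert(vectors=[(f"{doc_id}-{n}", embedding, {"text": chunk, "source": source})])
```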
`app.py`:

- A Streamlit UI that allows users to interact with the embedded data.
- Uses LangChain's `RetrievalQA` for answering questions based on the Pinecone vector store (see the sketch after this list).
- Displays chat history and responses dynamically.
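A hedged sketch of how such an app can be wired together. The exact imports depend on your LangChain version (this assumes the `langchain-openai` and `langchain-pinecone` packages), and the titles, prompts, and session-state keys are placeholders rather than the real `app.py`:

```python
# Minimal Streamlit + LangChain RetrievalQA sketch (assumes langchain-openai and langchain-pinecone).
import os

import streamlit as st
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

# Reuse the index that data_processing.py populated; API keys come from the environment.
vectorstore = PineconeVectorStore.from_existing_index(
    index_name=os.getenv("PINECONE_INDEX_NAME"),
    embedding=OpenAIEmbeddings(),
)
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=vectorstore.as_retriever(),
)

st.title("PDF Chatbot")  # placeholder title
if "history" not in st.session_state:
    st.session_state.history = []

question = st.text_input("Ask a question about the documents")
if question:
    answer = qa_chain.invoke({"query": question})["result"]
    st.session_state.history.append((question, answer))

# Render the running chat history.
for q, a in st.session_state.history:
    st.markdown(f"**You:** {q}")
    st.markdown(f"**Bot:** {a}")
```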
This project relies on environment variables for API keys and configuration. These should be stored in a .env file located in the project root.
.env Example:
OPENAI_API_KEY=your_openai_key_here
PINECONE_API_KEY=your_pinecone_key_here
PINECONE_ENV=your_pinecone_env
PINECONE_INDEX_NAME=your_index_name
⚠️ Important:
Make sure `.env` is added to your `.gitignore` file to avoid accidentally leaking your keys.
# .gitignore
.env

API keys are loaded using `os.getenv(...)`:
- In `data_processing.py`: `openai_api_key = os.getenv("OPENAI_API_KEY")`
- In `app.py`: `openai_api_key = os.getenv("OPENAI_API_KEY")`
If you want to switch environments or services, simply change the keys in your .env file; no code modification is needed.
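For `os.getenv(...)` to pick up values from the `.env` file, the scripts presumably load it first; a minimal sketch, assuming the `python-dotenv` package is used:

```python
# Minimal sketch of loading .env values into the environment (assumes python-dotenv).
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the process environment

openai_api_key = os.getenv("OPENAI_API_KEY")
pinecone_api_key = os.getenv("PINECONE_API_KEY")

if not openai_api_key or not pinecone_api_key:
    raise RuntimeError("Missing API keys; check your .env file.")
```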
- Install Dependencies (a sketch of a possible requirements.txt follows these steps)

  pip install -r requirements.txt
- Prepare .env

  Create a `.env` file and insert your API keys as described above.
- Run the Streamlit App

  streamlit run app.py
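The repository's actual requirements.txt is not reproduced here; based on the libraries mentioned in this README, it presumably lists something like the following (package names and the LangChain/Pinecone split are assumptions, and older setups used `pinecone-client` instead of `pinecone`; pin versions as needed):

```
streamlit
openai
pinecone
langchain
langchain-openai
langchain-pinecone
pymupdf
tiktoken
python-dotenv
```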
- The `SOURCES` and `IDs` arrays in `data_processing.py` determine which PDFs are loaded. Add your PDFs there if needed (see the sketch below this list).
- Make sure the names in `SOURCES` match actual filenames in your project directory.
- Ensure `pinecone` and `openai` services are correctly set up before running the app.
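A hypothetical illustration of how `SOURCES` and `IDs` can line up; the file names and IDs below are placeholders, and `embed_and_upsert` refers to the helper sketched earlier in this README rather than a function known to exist in the project:

```python
# Hypothetical SOURCES / IDs arrays; replace the entries with your own PDFs.
SOURCES = [
    "annual_report_2023.pdf",
    "product_manual.pdf",
]
IDs = [
    "annual-report-2023",
    "product-manual",
]

# Each PDF in SOURCES is processed under the matching entry in IDs.
for source, doc_id in zip(SOURCES, IDs):
    embed_and_upsert(source, doc_id)
```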