YouTube Channel Content Analyzer with Source Attribution
pip install -r requirements.txt
Set the necessary environment variables for API access:
export GOOGLE_APPLICATION_CREDENTIALS=XX
export ANTHROPIC_API_KEY=XX
export OPENAI_API_KEY=XX
The ingest.sh script lists all the videos in a channel given a video ID, then downloads the transcript and metadata for each video. Finally, it ingests the data into a Chroma vector store.
Run the following command to prepare the data:
./ingest.sh
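For orientation, here is a minimal sketch of how transcript text might be split into chunks that keep enough metadata to attribute an answer to its source video. It is not the code in ingest.sh: the `Chunk` class, the `chunk_transcript` and `source_url` helpers, and the segment shape (`text` and `start` keys) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A piece of transcript text plus the metadata needed to cite it."""
    text: str
    video_id: str
    start_seconds: float


def chunk_transcript(video_id, segments, max_chars=200):
    """Group consecutive transcript segments into chunks of about max_chars.

    `segments` is a list of {"text": str, "start": float} dicts. This shape
    is an assumption, not necessarily what ingest.sh produces.
    """
    chunks = []
    buffer = []
    buffer_start = None
    for segment in segments:
        if buffer_start is None:
            # Remember where the current chunk begins in the video.
            buffer_start = segment["start"]
        buffer.append(segment["text"])
        if len(" ".join(buffer)) >= max_chars:
            chunks.append(Chunk(" ".join(buffer), video_id, buffer_start))
            buffer, buffer_start = [], None
    if buffer:
        # Flush whatever is left after the last segment.
        chunks.append(Chunk(" ".join(buffer), video_id, buffer_start))
    return chunks


def source_url(chunk):
    """Build a link that opens the video at the chunk's start time."""
    seconds = int(chunk.start_seconds)
    return f"https://www.youtube.com/watch?v={chunk.video_id}&t={seconds}s"
```

Storing the video ID and start time next to each chunk is what would let an answer point back to the exact moment in a video, whichever vector store holds the text.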
To start the application, execute:
python main.py
Once the application is running, you can interact with it to ask science questions based on the ingested video content.
The default model is Anthropic's Claude 3.5 Sonnet.
To change the model provider, edit the following lines in models/chat_model.py:
PROVIDER = "openai" # Change to your desired provider
MODEL = "gpt-4o" # Specify the model you want to use
hubegpt = HubeGPT(provider=PROVIDER, model=MODEL)
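If you would rather not edit the file each time, one option is to read the provider and model from environment variables. The sketch below is an assumption, not part of the project: the variable names `HUBEGPT_PROVIDER` and `HUBEGPT_MODEL`, the `resolve_model_config` helper, the provider string `"anthropic"`, and the Claude model ID are all illustrative, so check them against what `HubeGPT` actually accepts.

```python
import os

# Assumed default model per provider; adjust to the models you have access to.
DEFAULT_MODELS = {
    "anthropic": "claude-3-5-sonnet-20240620",
    "openai": "gpt-4o",
}


def resolve_model_config(env=None):
    """Return (provider, model) from the environment, with fallbacks.

    HUBEGPT_PROVIDER and HUBEGPT_MODEL are hypothetical variable names.
    """
    env = os.environ if env is None else env
    provider = env.get("HUBEGPT_PROVIDER", "anthropic").lower()
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"Unsupported provider: {provider!r}")
    # Fall back to the provider's default model when none is given.
    model = env.get("HUBEGPT_MODEL", DEFAULT_MODELS[provider])
    return provider, model


# In models/chat_model.py this could replace the hard-coded constants:
# PROVIDER, MODEL = resolve_model_config()
```

With this in place, running `HUBEGPT_PROVIDER=openai python main.py` would select the OpenAI default without touching the code.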
If you encounter issues, check the following:
- Ensure all environment variables are set correctly.
- Verify that the required packages are installed.
- Check the logs for any error messages.
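The first check can be automated with a short script. This is a minimal sketch: the `missing_env_vars` helper is hypothetical, and the variable list assumes the names from the setup step, with the Anthropic key spelled `ANTHROPIC_API_KEY` as the Anthropic SDK expects. Adjust the list if your code reads different names.

```python
import os

# Variables this project is assumed to need; edit to match your setup.
REQUIRED_VARS = [
    "GOOGLE_APPLICATION_CREDENTIALS",
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
]


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running it before `python main.py` shows which variables are missing, so an authentication failure does not first surface deep inside an API call.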
Contributions are welcome! Please submit a pull request or open an issue for any enhancements or bug fixes.