Uses LangChain to perform Retrieval-Augmented Generation (RAG) over the MusicCaps song description dataset.
Live demo at [musebot.streamlit.app](https://musebot.streamlit.app) 🤗
Follow these steps to run musebot locally:

1. Clone the repository and enter the project directory:

```bash
git clone https://github.com/kutay25/musebot.git
cd musebot
```

2. Create and activate a virtual environment (VS Code's "Python: Create Environment" command is a convenient way to do this):

```bash
python -m venv .venv
source .venv/bin/activate      # macOS/Linux
.\.venv\Scripts\activate       # Windows
```

3. Install the dependencies and start the app:

```bash
pip install -r requirements.txt
streamlit run src/main.py
```
The app works by:

- Taking a prompt from the user
- Combining the prompt and the chat history into a single standalone question using a chat model
- Feeding the standalone question into the vector-database retriever, which returns a context (4 relevant songs with their links and descriptions)
- Combining the standalone question and the context into a single prompt, which is then answered by a chat model
- Appending the answer and any relevant video to the chat interface, and repeating (see the sketch after this list)
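
A minimal sketch of that chat loop in Streamlit. The `answer_question` stub stands in for the RAG chain described above; its name, the session-state keys, and the message format are illustrative, not the app's actual code:

```python
import streamlit as st

def answer_question(question: str, history: list) -> tuple[str, list[str]]:
    """Stub standing in for the RAG chain: condense the question,
    retrieve songs, and answer with a chat model."""
    return f"(answer to: {question})", []

if "history" not in st.session_state:
    st.session_state.history = []

# Replay the conversation so far on every Streamlit rerun
for msg in st.session_state.history:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])
        for url in msg.get("videos", []):
            st.video(url)

if prompt := st.chat_input("Ask about a song..."):
    with st.chat_message("user"):
        st.write(prompt)

    answer, video_urls = answer_question(prompt, st.session_state.history)

    st.session_state.history.append({"role": "user", "content": prompt})
    st.session_state.history.append(
        {"role": "assistant", "content": answer, "videos": video_urls}
    )
    with st.chat_message("assistant"):
        st.write(answer)
        for url in video_urls:
            st.video(url)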
Prompt generation and chaining are done with LangChain, which also has a great tutorial on building Q&A applications with chat history.
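
A rough sketch of such a chain, following LangChain's history-aware retrieval pattern. The prompt wording is illustrative, and `retriever` is assumed to be the FAISS retriever described below; the app's actual chain may differ:

```python
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")

# Step 1: rewrite the user's prompt + chat history into a standalone question
condense_prompt = ChatPromptTemplate.from_messages([
    ("system", "Rewrite the user's message as a standalone question, "
               "using the chat history for context."),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
history_aware_retriever = create_history_aware_retriever(llm, retriever, condense_prompt)

# Step 2: answer the standalone question, using the retrieved songs as context
answer_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using these song descriptions:\n\n{context}"),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
answer_chain = create_stuff_documents_chain(llm, answer_prompt)

rag_chain = create_retrieval_chain(history_aware_retriever, answer_chain)

# Example invocation; chat_history is a list of prior messages
result = rag_chain.invoke({"input": "Find me an upbeat jazz track", "chat_history": []})
print(result["answer"])
```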
FAISS, available through LangChain, is used as the retriever (an object that returns relevant chunks of text from a text or .csv source).
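
Turning a FAISS vector store into a retriever is a one-liner in LangChain; `k=4` matches the four songs returned per query (a sketch, assuming `vector_store` is the FAISS index built in the next paragraph):

```python
# Return the 4 most similar songs for each standalone question
retriever = vector_store.as_retriever(search_kwargs={"k": 4})
```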
The dataset over which the retriever searches is based on the MusicCaps Song Description Dataset, processed in the following ways:

- A subset of relevant songs was picked, removing unrelated videos such as guitar lessons or low-quality clips.
- Irrelevant features were dropped, and each entry's title was fetched from YouTube using the YouTube API.
- The resulting dataset is read by the Embedder object, which calls OpenAIEmbeddings to create a vector representation of each entry.
- Finally, the FAISS library builds a vector store from these embeddings, which is saved locally. Reloading from the saved store keeps load times low whenever an interaction reruns the Streamlit session, and greatly reduces embeddings API costs.
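
A sketch of that build-and-cache flow. The index directory and CSV path are illustrative, not the repo's actual file names; `allow_dangerous_deserialization` is required by recent LangChain versions when loading a local FAISS index:

```python
import os
from langchain_community.document_loaders import CSVLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

INDEX_DIR = "faiss_index"  # illustrative path
embeddings = OpenAIEmbeddings()

if os.path.exists(INDEX_DIR):
    # Reload the cached index: no embeddings API calls, fast Streamlit reruns
    vector_store = FAISS.load_local(
        INDEX_DIR, embeddings, allow_dangerous_deserialization=True
    )
else:
    # One-time build: embed every song description and save the index locally
    docs = CSVLoader("data/musiccaps_processed.csv").load()  # illustrative file
    vector_store = FAISS.from_documents(docs, embeddings)
    vector_store.save_local(INDEX_DIR)
```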