This repository contains scripts for analyzing music features, training recommendation models, and building a real-time recommendation system using Apache Kafka.
- load1.py: Extracts audio features using Librosa and prints them for each audio file. Also includes plotting of normalized MFCCs, Spectral Centroid, and Zero-Crossing Rate.
- mongodb1.py: Inserts audio features into MongoDB for storage and retrieval.
- connector.py: Connects Apache Spark with MongoDB to read data into Spark DataFrames.
- PHASE2.py: Trains a music recommendation model using Annoy and performs nearest neighbor search.
- producer.py: Streams music features to Kafka for real-time processing.
- consumer.py: Consumes music recommendations from Kafka and applies them.
- app.py: A web application to upload audio files and get insights.
- index.html: A simple web interface for uploading files and displaying insights.
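The feature extraction and nearest-neighbor steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in load1.py or PHASE2.py: the function names, the choice of 13 MFCCs, the per-frame averaging, and the angular metric are all assumptions made for the example.

```python
def summarize(matrix):
    """Average each row of a (features x frames) matrix over time."""
    return [sum(row) / len(row) for row in matrix]


def extract_features(path, n_mfcc=13):
    """Return one fixed-length feature vector for an audio file (hypothetical helper)."""
    import librosa  # imported lazily so summarize() works without librosa installed

    y, sr = librosa.load(path, sr=None)  # sr=None keeps the file's native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    zcr = librosa.feature.zero_crossing_rate(y)
    # Each result is a 2-D array (features x frames); reduce it to per-row means.
    return (summarize(mfcc.tolist())
            + summarize(centroid.tolist())
            + summarize(zcr.tolist()))


def build_index(vectors, n_trees=10, metric="angular"):
    """Build an Annoy index over equal-length feature vectors."""
    from annoy import AnnoyIndex

    index = AnnoyIndex(len(vectors[0]), metric)
    for item_id, vector in enumerate(vectors):
        index.add_item(item_id, vector)
    index.build(n_trees)  # more trees: better recall, larger index
    return index


def recommend(paths, query=0, k=5):
    """Return up to k files whose features are closest to paths[query]."""
    vectors = [extract_features(p) for p in paths]
    index = build_index(vectors)
    # Ask for one extra neighbor because the query item normally matches itself.
    neighbours = index.get_nns_by_item(query, k + 1)
    return [paths[i] for i in neighbours if i != query][:k]
```

The spectral centroid is measured in Hz and is far larger than the other values, so scaling each dimension before indexing is probably worthwhile in practice.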
- Clone the Repository: Clone or download the repository to your local machine.
git clone https://github.com/tashi-2004/BDA-PROJECT-FMA-A-Dataset-For-Music-Analysis
- Install Dependencies: Install the required Python dependencies using:
pip install -r requirements.txt
- Set Up MongoDB: Ensure MongoDB is installed and running on your system. Update MongoDB connection strings in the relevant scripts.
- Set Up Kafka: Install and run Apache Kafka on your system. Update the Kafka broker address in producer.py and consumer.py.
- Run Scripts: Execute the scripts in the following order:
  - Extract and Visualize Audio Features: Run load1.py to extract and visualize audio features from your audio files.
  - Store Audio Features in MongoDB: Run mongodb1.py to store the extracted audio features in MongoDB.
  - Data Analysis with Spark: Run connector.py to connect Spark with MongoDB and perform data analysis using Spark DataFrames.
  - Train Recommendation Models: Run PHASE2.py to train music recommendation models using Annoy and perform nearest neighbor searches.
  - Stream Music Features to Kafka: Run producer.py to stream music features to Kafka for real-time processing.
  - Consume Music Recommendations from Kafka: Run consumer.py to consume music recommendations from Kafka and apply them.
  - Web Interface for Audio Files: Use app.py and index.html to upload audio files via a web interface and get insights.
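The producer and consumer steps can be pictured with a sketch like the one below. It assumes the kafka-python client, a JSON message format, and a topic called music_features; none of these are confirmed by the repository, so treat them as placeholders for whatever producer.py and consumer.py actually use.

```python
import json

TOPIC = "music_features"  # hypothetical topic name


def encode(track_id, features):
    """Serialize one track's feature vector as UTF-8 JSON bytes."""
    return json.dumps({"track_id": track_id, "features": list(features)}).encode("utf-8")


def decode(payload):
    """Inverse of encode(): bytes back to a dict."""
    return json.loads(payload.decode("utf-8"))


def stream_features(records, broker="localhost:9092"):
    """Send (track_id, features) pairs to Kafka."""
    from kafka import KafkaProducer  # kafka-python, imported lazily

    producer = KafkaProducer(bootstrap_servers=broker)
    for track_id, features in records:
        producer.send(TOPIC, encode(track_id, features))
    producer.flush()  # block until buffered messages are delivered
    producer.close()


def consume_features(broker="localhost:9092"):
    """Yield decoded messages from the topic, starting at the oldest offset."""
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(TOPIC, bootstrap_servers=broker,
                             auto_offset_reset="earliest")
    for message in consumer:
        yield decode(message.value)
```

Keeping encode and decode as plain functions means the message format can be checked without a running broker.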
You can customize the scripts according to your requirements, such as adjusting feature extraction parameters, changing MongoDB or Kafka configurations, or modifying recommendation model algorithms.
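One way to make these adjustments without editing each script is to gather them in a single settings object read from environment variables. The sketch below is only a suggestion: the variable names (MONGO_URI, KAFKA_BROKER, N_MFCC, ANNOY_TREES) and the default values are assumptions and do not exist in the repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    mongo_uri: str = "mongodb://localhost:27017"  # default local MongoDB port
    kafka_broker: str = "localhost:9092"          # default local Kafka port
    n_mfcc: int = 13                              # assumed number of MFCC coefficients
    n_trees: int = 10                             # assumed number of Annoy trees


def load_settings(env=None):
    """Build Settings from a mapping (os.environ by default), falling back to defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        mongo_uri=env.get("MONGO_URI", defaults.mongo_uri),
        kafka_broker=env.get("KAFKA_BROKER", defaults.kafka_broker),
        n_mfcc=int(env.get("N_MFCC", defaults.n_mfcc)),
        n_trees=int(env.get("ANNOY_TREES", defaults.n_trees)),
    )
```

Each script could then call load_settings() once at startup instead of hard-coding connection strings.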
- Tashfeen Abbasi
- Laiba Mazhar
- Rafia Khan
Feel free to contribute to this project by submitting issues or pull requests. Enjoy analyzing and recommending music with this comprehensive toolkit!