For this project, the data was collected from Spotify using Spotify’s API and the ‘spotipy’ library. The data covers the playlist Best Hindi songs 2023 in the United States (ranked weekly):
https://open.spotify.com/playlist/0zc6Hq9OIAengtGG6a3lfs
Services used: Google Cloud Storage, IAM and Admin, Google Cloud Functions, Cloud Scheduler, Google Pub/Sub, Snowflake
Process:
- Created a Spotify developer account to generate API keys.
- Assigned roles and policies for the services used through IAM.
- Created a custom Python package and a custom layer, listed in the file requirement.txt, to make the spotipy library available in GCP.
- Coded a function in Google Cloud Functions to extract raw data using the Spotify API and the ‘spotipy’ Python library.
- Automated data extraction with Cloud Scheduler to run at a specified time and store the data in the ‘raw_data’ folder of a Cloud Storage bucket.
- Executed a transform function to convert the raw data into 3 CSV files: album, artist, songs.
- Automated the transform function so that it copies the transformed files into the ‘processed_data’ folder and deletes the data from the ‘raw_data’ folder.
- Designed tables for album, artist and songs in Snowflake.
- Created a storage integration to access the data in Cloud Storage, and defined a file format for the CSV files.
- Created a Snowpipe that uses Google Pub/Sub to load the data automatically when a raw file appears in the raw_data folder.
- Queried the tables.
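
The transform step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual function: it assumes the raw file has the shape of a Spotify playlist-items response (`items`, each with a `track` holding `album` and `artists`), and the function names `transform` and `to_csv` as well as the output column names are hypothetical.

```python
import csv
import io


def transform(raw):
    """Split a playlist-items payload into album, artist and song rows.

    Assumption: `raw` looks like a Spotify playlist-items response.
    The column names below are illustrative, not the project's schema.
    """
    albums, artists, songs = {}, {}, []
    for item in raw.get("items", []):
        track = item.get("track")
        if not track:  # skip entries whose track is missing or null
            continue
        album = track["album"]
        # Dicts keyed by id so a repeated album or artist is stored once.
        albums[album["id"]] = {
            "album_id": album["id"],
            "album_name": album["name"],
            "release_date": album.get("release_date"),
            "total_tracks": album.get("total_tracks"),
        }
        for artist in track["artists"]:
            artists[artist["id"]] = {
                "artist_id": artist["id"],
                "artist_name": artist["name"],
            }
        songs.append({
            "song_id": track["id"],
            "song_name": track["name"],
            "duration_ms": track.get("duration_ms"),
            "popularity": track.get("popularity"),
            "added_at": item.get("added_at"),
            "album_id": album["id"],
            # Only the first listed artist is linked to the song here.
            "artist_id": track["artists"][0]["id"] if track["artists"] else None,
        })
    return list(albums.values()), list(artists.values()), songs


def to_csv(rows):
    """Serialize a list of uniform dicts to CSV text with a header row."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In a Cloud Function, the three strings returned by `to_csv` would then be written to the ‘processed_data’ folder; that upload step is left out here because it depends on the bucket setup.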
It was a great experience using an API to collect data and then transforming the raw data into legible data on GCP. This was a fun project because I was able to work with real-time data that keeps updating.
Spotify Developer: https://developer.spotify.com
![image](https://private-user-images.githubusercontent.com/82742908/275697367-cbc62ff3-e71c-454b-bdc9-ded839f20758.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTk4NjU4NjYsIm5iZiI6MTcxOTg2NTU2NiwicGF0aCI6Ii84Mjc0MjkwOC8yNzU2OTczNjctY2JjNjJmZjMtZTcxYy00NTRiLWJkYzktZGVkODM5ZjIwNzU4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzAxVDIwMjYwNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWRhNTZiNzk2ZDFiOWM3MWRlNjgzNGEzZDc2Yzc3YWUxYWU4ZmYzYWY3ZTFjMDVhODBmYTdhOTkxYjNiZmU5MWUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.EqVP8orwQ9fWPyjWoH0Bax7oTu41alvDSIrGD_VuSl4)