jacksonbull87 / Roses-Explosion

Analyzing TikTok data to identify KPIs that define the most popular songs of Summer 2020


Sounds of Summer 2020

An Analytical Approach to Finding The Song of the Summer

Introduction

Summer is typically an exciting time for music fans: as the days get longer and the temperature heats up, upbeat new releases provide the perfect soundtrack to BBQs, trips to the beach, road trips, block parties, and other forms of real-life escapism. Using data from TikTok's weekly music charts, the goal of this project is to examine the most popular songs as defined by the following KPIs:

  • Longevity (Time Spent on Chart)
  • Velocity (Biggest Change in Rank Over a 7-Day Period)
  • Spotify Popularity (Biggest Increase in Popularity Score)
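As a rough illustration of how the first two KPIs could be derived from weekly chart rows, here is a minimal sketch. The toy data and the column names (track_name, week, rank) are assumptions for illustration, not the project's actual schema, and velocity is taken here as the largest week-over-week climb in rank, assuming consecutive weekly entries.

```python
import pandas as pd

# Toy weekly chart rows (hypothetical data, not taken from the project).
charts = pd.DataFrame({
    "track_name": ["A", "A", "A", "B", "B"],
    "week": pd.to_datetime(
        ["2020-06-01", "2020-06-08", "2020-06-15", "2020-06-08", "2020-06-15"]
    ),
    "rank": [50, 20, 5, 10, 12],
})

# Order each track's rows chronologically before differencing.
charts = charts.sort_values(["track_name", "week"])

# Longevity: number of distinct weekly charts each track appears on.
longevity = charts.groupby("track_name")["week"].nunique()

# Velocity: previous rank minus current rank, so a positive value means
# the track climbed; keep the biggest climb per track.
charts["rank_change"] = -charts.groupby("track_name")["rank"].diff()
velocity = charts.groupby("track_name")["rank_change"].max()
```

A track's first week has no previous rank, so its rank_change is missing and is ignored by max().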


Technology

Data Source: ChartMetric

Date Range: 05/25/20-08/15/20

Language: Python

Visualization: Tableau

Data Collection

ChartMetric API - TikTok Weekly Chart Data

With the help of ChartMetric's API, I gathered data on the top 100 weekly tracks on TikTok from May 2nd to August 15th. To make API calls to CM's server, I need an API token, which I get by importing it from my config file: from cm_api import token. For security purposes, the refresh token is kept in a config file that is excluded from version control via .gitignore: refresh_token = token['refresh_token']. With the refresh_token in hand, I can use the following helper functions:

  1. Get my temporary api_token: from cm_api import get_api_token

    • Parameters: refresh_token
  2. Get TikTok chart data: from cm_api import get_tiktok_chart_data
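The project's own helpers are not shown here, so the following is only a sketch of what exchanging a refresh token for a temporary token might look like. The endpoint URL, the "refreshtoken" request field, the "token" response field, and the function names build_token_request and fetch_api_token are all assumptions and should be checked against ChartMetric's API documentation.

```python
import json
import urllib.request

# Assumption: ChartMetric's token endpoint. Verify against the official docs.
TOKEN_URL = "https://api.chartmetric.com/api/token"


def build_token_request(refresh_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that exchanges a refresh token."""
    # Assumption: the API expects a JSON body with a "refreshtoken" field.
    body = json.dumps({"refreshtoken": refresh_token}).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_api_token(refresh_token: str) -> str:
    """Send the request and return the short-lived access token."""
    with urllib.request.urlopen(build_token_request(refresh_token)) as resp:
        # Assumption: the response JSON carries the token under "token".
        return json.load(resp)["token"]
```

Keeping request construction separate from the network call makes it possible to inspect the request without touching the API or exposing the token.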

Once I extracted all the charts, I concatenated the dataframes along axis 0 to create one master dataframe and saved it: master_df.to_csv('datasets/historic_ttwk.csv'). Each row represents a chart position for a specific week, so an artist appears more than once if they spent more than one week on the charts.
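A minimal sketch of the concatenation step, using two toy weekly charts in place of the real API responses. The column names and values are made up for illustration.

```python
import pandas as pd

# Two toy weekly charts (hypothetical stand-ins for the API results).
weekly_charts = [
    pd.DataFrame({"rank": [1, 2], "track_name": ["A", "B"], "chart_date": "2020-06-01"}),
    pd.DataFrame({"rank": [1, 2], "track_name": ["B", "C"], "chart_date": "2020-06-08"}),
]

# Stack the weekly frames row-wise; ignore_index gives a clean 0..n-1 index.
master_df = pd.concat(weekly_charts, axis=0, ignore_index=True)

# Track "B" charted in both weeks, so it occupies two rows but counts once.
unique_tracks = master_df["track_name"].nunique()
```

This is why the number of chart positions is larger than the number of unique tracks or artists.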

I was able to fetch chart positions from the past 4 months:

  • 1500 Chart Positions
  • 375 Unique Tracks
  • 351 Unique Artists

Feature Engineering

Beyond the primary goals of this project, I also want to analyze the distribution of tracks by genre and era. Using my API helper function, from cm_api import get_track_meta, I grabbed the tags associated with each cm_id. Then, using a dictionary that maps each cm_id to its genre, I engineered a new feature:

ttwk_mstr['track_genre'] = ttwk_mstr['cm_id'].map(track_genre)
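To make the mapping concrete, here is a small sketch with made-up IDs and genres. One thing worth knowing is that Series.map leaves a missing value for any cm_id that is absent from the dictionary, so it can be useful to count those afterwards.

```python
import pandas as pd

# Hypothetical IDs and genres, for illustration only.
ttwk_mstr = pd.DataFrame({"cm_id": [101, 102, 101, 103]})
track_genre = {101: "pop", 102: "hip-hop"}

# Look up each row's cm_id in the dictionary; unknown IDs become NaN.
ttwk_mstr["track_genre"] = ttwk_mstr["cm_id"].map(track_genre)

# Count the rows that did not receive a genre.
missing = int(ttwk_mstr["track_genre"].isna().sum())
```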

Using one of Pandas' built-in datetime methods, I engineered a new feature for era:

ttwk_mstr['era'] = ttwk_mstr['release_date'].dt.year
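The .dt accessor only works on datetime columns, so if release_date was loaded as text (for example from a CSV), it needs to be converted first. A short sketch with made-up dates:

```python
import pandas as pd

# Hypothetical release dates stored as strings, as they would be after read_csv.
ttwk_mstr = pd.DataFrame({"release_date": ["2020-05-29", "1977-02-04"]})

# Convert to datetime so the .dt accessor becomes available.
ttwk_mstr["release_date"] = pd.to_datetime(ttwk_mstr["release_date"])

# Extract the release year as the era feature.
ttwk_mstr["era"] = ttwk_mstr["release_date"].dt.year
```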

Dataset Features Note: The features listed below do not represent all features in my master dataframe; these are just the ones I ended up using for my analysis.

track_name artist_name cm_id label release_date rank add_date velocity peak_rank peak_date time_on_chart track_genre year

ChartMetric API - Spotify Popularity Data For Artists

To show historic trends in popularity score for our TikTok artists, I created another API helper function, from cm_api import get_fan_metrics, to retrieve data over the past 12 months (begin_date = 2019-09-02), but only for artists who had a peak rank between 1 and 10.

Dataset Features

timestamp artist cm_artist_id popularity
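As a sketch of how the top-10 filter and the "biggest increase in popularity score" KPI could fit together, the example below uses toy data. The values are invented, and measuring the increase as last score minus first score over the window is an assumption about how the KPI is computed.

```python
import pandas as pd

# Toy chart data (hypothetical): only X and Z peaked inside the top 10.
ttwk_mstr = pd.DataFrame({
    "artist_name": ["X", "Y", "X", "Z"],
    "peak_rank": [3, 25, 3, 10],
})
# between() is inclusive on both ends by default.
top_artists = ttwk_mstr.loc[ttwk_mstr["peak_rank"].between(1, 10), "artist_name"].unique()

# Toy popularity history (hypothetical scores).
popularity = pd.DataFrame({
    "timestamp": pd.to_datetime(["2019-09-02", "2020-08-15"] * 3),
    "artist": ["X", "X", "Y", "Y", "Z", "Z"],
    "popularity": [40, 75, 60, 62, 80, 78],
})

# Keep only top-10 artists, order chronologically, then take last minus first.
top = popularity[popularity["artist"].isin(top_artists)].sort_values("timestamp")
increase = (
    top.groupby("artist")["popularity"]
    .agg(lambda s: s.iloc[-1] - s.iloc[0])
    .sort_values(ascending=False)
)
```

Artists outside the top 10 never enter the calculation, which mirrors restricting the API calls to that subset.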

Data Located Here -->

Visualizations - Key Performance Metrics

Longevity (Time On Chart)

The visualization below shows the top trending TikTok tracks, sorted in descending order by time spent on the chart.

Link To Interactive Visualization


