pldubouilh / spotify-gdpr-dump-analysis

analysis of complete spotify streaming dataset (endsong_*.json)

spotify-gdpr-dump-analysis

Local analysis of the complete Spotify streaming dataset (endsong_*.json). Made in 3 hours alongside ChatGPT, fixing bugs as they appeared.

Ask for your GDPR streaming data dump here. It takes a couple of days to arrive.

That's a whole lot of data 👀

# deps
$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

# get geodb for local ip lookup
$ curl -L -o city.mmdb https://git.io/GeoLite2-City.mmdb

# create sqlite3 database from json dump. datafolder should contain all your endsong_*.json files
$ python makedb.py datafolder/

# run analysis !
$ python map-ips-city.py
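
As a rough illustration of the database step, the sketch below loads every endsong_*.json file from a folder into a single SQLite table. It is not the repository's makedb.py: the table name `streams`, the function `build_db`, and the choice of columns are assumptions made for this example. The column names follow the fields commonly found in Spotify's extended streaming history export and may differ in your dump.

```python
import glob
import json
import os
import sqlite3

# Hypothetical column selection; adjust to the fields present in your dump.
COLUMNS = [
    "ts",
    "ms_played",
    "conn_country",
    "ip_addr_decrypted",
    "master_metadata_track_name",
    "master_metadata_album_artist_name",
]


def build_db(folder, db_path="streams.db"):
    """Load all endsong_*.json files in `folder` into one SQLite table.

    Returns the number of rows inserted.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(f"CREATE TABLE IF NOT EXISTS streams ({', '.join(COLUMNS)})")
    placeholders = ", ".join("?" for _ in COLUMNS)
    inserted = 0
    for path in sorted(glob.glob(os.path.join(folder, "endsong_*.json"))):
        with open(path, encoding="utf-8") as fh:
            entries = json.load(fh)  # each file is assumed to hold a JSON list
        # Missing fields become NULL instead of raising KeyError.
        rows = [tuple(entry.get(col) for col in COLUMNS) for entry in entries]
        conn.executemany(f"INSERT INTO streams VALUES ({placeholders})", rows)
        inserted += len(rows)
    conn.commit()
    conn.close()
    return inserted
```

Calling `build_db("datafolder/")` would then produce a `streams.db` file that can be queried with any SQLite client.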

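
The city-level scripts depend on resolving each logged IP address against the local city.mmdb file. The sketch below shows one way this could work; it is not taken from the repository. `top_cities` and `mmdb_lookup` are hypothetical names, and the use of the third-party `geoip2` package is an assumption (check requirements.txt for what the project actually uses).

```python
from collections import Counter


def top_cities(ips, lookup, n=10):
    """Count plays per (city, country), most frequent first.

    `lookup` maps an IP string to a (city, country) tuple, or None when
    the address cannot be resolved.
    """
    counts = Counter()
    for ip in ips:
        place = lookup(ip)
        if place is not None:
            counts[place] += 1
    return counts.most_common(n)


def mmdb_lookup(mmdb_path="city.mmdb"):
    """Build a lookup function backed by a local MaxMind database.

    Assumes the third-party `geoip2` package is installed.
    """
    import geoip2.database
    import geoip2.errors

    reader = geoip2.database.Reader(mmdb_path)

    def lookup(ip):
        try:
            resp = reader.city(ip)
        except (geoip2.errors.AddressNotFoundError, ValueError):
            return None  # unknown or malformed address
        return (resp.city.name, resp.country.name)

    return lookup
```

Passing the lookup in as a function keeps the counting logic independent of the geo database, so it can be tried with a plain dictionary: `top_cities(ips, {"1.2.3.4": ("Berlin", "Germany")}.get)`.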

$ python top-cities.py
df                     city         country  count
20                   Berlin         Germany   2629
...
$ python top-songs-per-country.py
DE                                                 La femme d'argent                   Air
DE  Piano Concerto No. 3 in D Minor, Op. 30: I. Allegro ma non tanto   Sergei Rachmaninoff
DE                                La mer, L. 109: II. Jeux de vagues        Claude Debussy
DE                                                   Samba da Bencao        Bebel Gilberto
DE                                      Merry Christmas Mr. Lawrence      Ryuichi Sakamoto
DE                                                        WEIGHT OFF            KAYTRANADA
...
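
A per-country ranking like the one above can be expressed as a single grouped SQL query. The following is a minimal sketch, not the repository's top-songs-per-country.py: it assumes a SQLite table named `streams` with the columns `conn_country`, `master_metadata_track_name` and `master_metadata_album_artist_name`, and the function name `top_songs_per_country` is hypothetical.

```python
import sqlite3

# Count plays per (country, track, artist); rows without a track name
# (e.g. podcast episodes) are skipped.
QUERY = """
SELECT conn_country,
       master_metadata_track_name,
       master_metadata_album_artist_name,
       COUNT(*) AS plays
FROM streams
WHERE master_metadata_track_name IS NOT NULL
GROUP BY conn_country, master_metadata_track_name, master_metadata_album_artist_name
ORDER BY conn_country, plays DESC, master_metadata_track_name
"""


def top_songs_per_country(conn, limit=5):
    """Return {country: [(track, artist, plays), ...]}, at most `limit` songs each."""
    result = {}
    for country, track, artist, plays in conn.execute(QUERY):
        bucket = result.setdefault(country, [])
        if len(bucket) < limit:  # rows arrive sorted by plays within a country
            bucket.append((track, artist, plays))
    return result
```

With a database file on disk, usage would look like `top_songs_per_country(sqlite3.connect("streams.db"))`, where the file name is again an assumption.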

License:MIT License


Languages

Language:Python 100.0%