anbinh / musescore-dataset

The dataset of all music sheets and users on musescore.com

musescore-dataset

The unofficial dataset of all music sheets and users on musescore.com, dedicated to big data analytics / data science / machine learning.

All data is collected by iterating through musescore.com's public API.

The jsonl files are in the Newline-delimited JSON (JSON Lines) format.
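Because each line of a JSON Lines file is one complete JSON object, ordinary line-oriented tools are enough for a first look. The following is a minimal sketch, not part of the project: the helper names `jsonl_count` and `jsonl_head` are hypothetical, and it assumes the file contains no blank lines other than possibly a trailing one.

```shell
# Hypothetical helpers for a first look at a JSON Lines file.

# Count the records: one JSON object per non-empty line.
jsonl_count() {
    grep -c . "$1"
}

# Print the first N records (default 1) to see which fields are present.
jsonl_head() {
    head -n "${2:-1}" "$1"
}

# Usage, after downloading one of the jsonl files:
#   jsonl_count score.jsonl
#   jsonl_head score.jsonl 3
```

Counting lines this way avoids loading the whole file into memory, which matters for files of this size.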

Only need the sheet files to learn music? Try musescore-downloader.

View/Query in Google BigQuery

User Data

Updated manually.
Last updated: Nov 9, 2020

https://musescore-dataset.xmader.com/user.jsonl

Music Sheet Metadata

Updated daily at 7:10 am ET (UTC-5, or UTC-4 during Daylight Saving Time)

https://musescore-dataset.xmader.com/score.jsonl

All mscz files

Updated daily at 7:10 am ET (UTC-5, or UTC-4 during Daylight Saving Time)

https://musescore-dataset.xmader.com/mscz-files.csv

# The CSV file itself is on IPFS
ipns="QmSdXtvzC8v8iTTZuj5cVmiugnzbR1QATYRcGix4bBsioP"
# Resolve the IPNS name to the current CID (URL quoted so the shell leaves "?" alone)
cid=$(curl "https://ipfs.io/api/v0/dag/resolve?arg=/ipns/$ipns/" | grep -o "\\w\{46\}")
wget -O mscz-files.csv "https://ipfs.infura.io/ipfs/${cid}/mscz-files.csv"

This is a CSV file that maps each score id (id) to the IPFS reference (ref) of the corresponding mscz file.
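To fetch a single score rather than the whole collection, the reference for one id can be looked up in the CSV first. This is a rough sketch under stated assumptions: the helper name `lookup_ref` is hypothetical, and the file is assumed to have a header row followed by exactly two comma-separated columns, `id` and `ref`, with no quoted fields.

```shell
# Hypothetical helper: print the IPFS reference for one score id.
# Assumes a header row, then rows of the form "id,ref" without quoting.
lookup_ref() {
    # Appending "" forces a string comparison, so ids are matched exactly.
    awk -F, -v id="$2" 'NR > 1 && ($1 "") == (id "") { print $2 }' "$1"
}

# Usage (the id below is a placeholder, not a real score id):
#   lookup_ref mscz-files.csv 123456
```

The function prints nothing when the id is not present, so an empty result can be used as a "not found" check.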

All files are available on IPFS.
NO ONE CAN TAKE IT DOWN NOW!

Bulk Download

See https://discord.com/channels/774491656643674122/774491656643674128/784661028310220820

(You must join the LibreScore Community Discord first to see the message.)

Download mscz files via IPFS HTTP Gateways

#!/bin/bash
while IFS=, read -r id ref
do
    wget -nv "https://ipfs.infura.io$ref" -O "$id.mscz"
done < <(sed '1d' mscz-files.csv)

Or use a local IPFS daemon

#!/bin/bash

# Install IPFS https://docs.ipfs.io/how-to/command-line-quick-start/#install-ipfs

ipfs daemon --init &

while IFS=, read -r id ref
do
    ipfs get "$ref" -o "$id.mscz"
done < <(sed '1d' mscz-files.csv)
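Both loops above start from the first row every time they run, so an interrupted bulk download repeats work. One possible way to resume, sketched here with a hypothetical helper `pending_rows`, is to filter the CSV down to the rows whose output file is still missing or empty. The empty-file check is there because `wget -O` can leave a zero-length file behind after a failed request. The sketch assumes the same two-column `id,ref` layout with a header row.

```shell
# Hypothetical helper: print the "id,ref" rows whose <id>.mscz is missing
# or empty in the output directory (default: current directory).
pending_rows() {
    csv=$1
    dir=${2:-.}
    # Skip the header row, then keep only rows without a non-empty file.
    sed '1d' "$csv" | while IFS=, read -r id ref
    do
        [ -s "$dir/$id.mscz" ] || printf '%s,%s\n' "$id" "$ref"
    done
}

# Usage: feed the remaining rows to the same kind of download loop.
#   pending_rows mscz-files.csv . | while IFS=, read -r id ref
#   do
#       wget -nv "https://ipfs.infura.io$ref" -O "$id.mscz"
#   done
```

A file that exists but was only partially written still counts as done here, so this check is a convenience for resuming, not an integrity check.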

Contact me if you have any questions.

The purpose of this project is to make the data of musescore.com accessible to anyone who needs it, and to bring a clean, high-quality music dataset to the world of computer science. It is not intended for individuals who only want to keep the data without any purpose.

Special Thanks

I would like to thank Luca B. for telling me that what I am doing is meaningful.
