Didayolo / hsmusic

Huge Symbolic Music Dataset

Huge Symbolic Music Dataset (HSMusic)

HSMusic is a large, participatory MIDI collection. More precisely, it is:

  • A joint effort to gather as many MIDI files as possible and tag them automatically.
  • A small Python library to feed Machine Learning models with MIDI files.

Count: 130 943
Data format: MIDI
Tags: style, composer, title and many more (multi-label)

How to contribute?

If you want to add MIDI files to the HSMusic dataset, please contact me: adrien.pavao@gmail.com. You can also become a contributor to this repository or open GitHub issues.

Roadmap

  • Complete the data collection
  • Clean the code for data management and labeling
  • Code ML baselines (discriminative and generative models)
  • Write documentation and example notebook

Library overview

  • to_matrix: convert a MIDI file into a binary matrix
  • to_midi: convert a binary matrix into a MIDI file
  • Some models...
  • TODO: read_data, data_augmentation (transposition, rhythm, etc.), change_mode
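
To give an idea of what a binary matrix representation and a transposition augmentation can look like, here is a minimal sketch in plain Python. It does not use the HSMusic library and does not parse real MIDI files: the helper names (notes_to_matrix, matrix_to_notes, transpose) and the (pitch, start, end) note format are assumptions made for illustration, and the actual to_matrix and to_midi functions may work differently.

```python
# Illustrative sketch only: these helpers are hypothetical and are NOT the
# HSMusic API. A note is a (pitch, start_step, end_step) tuple, with the end
# step exclusive and time already quantized into integer steps.
N_PITCHES = 128  # MIDI pitch numbers range from 0 to 127


def notes_to_matrix(notes, n_steps):
    """Build a binary piano roll: matrix[t][p] is 1 if pitch p sounds at step t."""
    matrix = [[0] * N_PITCHES for _ in range(n_steps)]
    for pitch, start, end in notes:
        # Clip the note to the time range covered by the matrix.
        for t in range(max(start, 0), min(end, n_steps)):
            matrix[t][pitch] = 1
    return matrix


def matrix_to_notes(matrix):
    """Recover (pitch, start, end) notes from a binary piano roll."""
    notes = []
    for pitch in range(N_PITCHES):
        start = None  # step at which the currently sounding note began
        for t, row in enumerate(matrix):
            if row[pitch] and start is None:
                start = t
            elif not row[pitch] and start is not None:
                notes.append((pitch, start, t))
                start = None
        if start is not None:  # note still sounding at the last step
            notes.append((pitch, start, len(matrix)))
    return sorted(notes, key=lambda n: (n[1], n[0]))


def transpose(notes, semitones):
    """Shift every pitch by `semitones`, dropping notes that leave 0..127."""
    return [
        (pitch + semitones, start, end)
        for pitch, start, end in notes
        if 0 <= pitch + semitones < N_PITCHES
    ]


melody = [(60, 0, 2), (64, 2, 4), (67, 4, 8)]  # C4, E4, G4
roll = notes_to_matrix(melody, n_steps=8)
assert matrix_to_notes(roll) == melody  # round trip preserves this melody
print(transpose(melody, 2))  # [(62, 0, 2), (66, 2, 4), (69, 4, 8)]
```

One limitation of a purely binary matrix is that two consecutive notes on the same pitch merge into a single longer note, and velocity is lost. A real conversion would also need to choose a time resolution when quantizing MIDI ticks into steps.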

Sources
