allanino / rnn-composer-helper

Scripts I used to process ABC files and train char-rnn on them.

RNN composer help

Here I share some scripts I used to help me train char-rnn to compose tunes in ABC notation.

I keep a blog called RNN music of the day, where I post one tune generated by this method each day.

I don't share my dataset as I'm not sure about possible copyright issues.

Pre-requisites

You'll need to install Torch and the other requirements from char-rnn. Just follow the instructions.

I rely heavily on system calls to get the job done, so you'll need to install some tools on your machine for these scripts to work:

Some Python modules, which can be installed with:

pip install beautifulsoup4 requests

Beautiful Soup is used by a crawler I wrote to get some more data.

We also need a dataset (obviously) and some soundfonts to convert the generated music to MP3.

Let's see how we can get files to train the network.

Getting files to train the network

I searched the web with a preference for huge ABC collections. The most important files I found are these:

Among the links above, I must emphasize the Hanny Christen collection of Swiss folk music for two reasons: it's huge, with over 10,000 tunes (about 4.4 MB after I preprocessed it), and all of its tunes have chords, which gives rise to richer compositions. I have in fact trained the network on this data alone and got great results.

If I forgot something, please open an issue and I'll update this.

My workflow

I'll just give an overview here. Each script has a help message; just pass it the -h flag.

  1. Use preprocess.py to concatenate and shuffle all files into one big file.
  2. Train char-rnn on that file.
  3. Sample the trained network to generate a big file with compositions (typically using seqlen=100000).
  4. Use file_to_midi_and_abc.py to convert that big file into many MIDI files and ABC files.
  5. Use abc_to_mp3.py to convert the MIDI files into MP3 files.
  6. Use abc_to_svg.py to convert the ABC files into SVG sheets.

Step 5 needs one more clarification: I used the soundfont 754-Donnys Guitar to synthesize better-sounding files.
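To make steps 1 and 4 more concrete, here is a minimal sketch of splitting ABC text into individual tunes and then shuffling and concatenating them. It is not the code from preprocess.py or file_to_midi_and_abc.py. The function names split_tunes and shuffle_and_join are hypothetical, and the sketch assumes every tune starts with an "X:" reference-number line, which is the usual ABC convention.

```python
import random


def split_tunes(text):
    """Split ABC text into tunes, assuming each tune starts with an 'X:' line."""
    tunes, current = [], None
    for line in text.splitlines():
        if line.startswith("X:"):
            # A new tune begins: store the previous one, if any.
            if current:
                tunes.append("\n".join(current).strip())
            current = [line]
        elif current is not None:
            # Lines before the first 'X:' are ignored as non-tune text.
            current.append(line)
    if current:
        tunes.append("\n".join(current).strip())
    return tunes


def shuffle_and_join(texts, seed=0):
    """Collect tunes from several ABC texts, shuffle them, and join them."""
    tunes = [tune for text in texts for tune in split_tunes(text)]
    random.Random(seed).shuffle(tunes)
    # A blank line between tunes is the usual ABC tune separator.
    return "\n\n".join(tunes) + "\n"
```

Shuffling whole tunes rather than lines keeps each tune intact, and the same split_tunes helper could be reused to cut a sampled file into one tune per output file.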

Tips on training char-rnn

Be careful not to overfit, especially when using small datasets. Once I got marvelous results on a 200 KB dataset, only to discover that some tunes were plagiarized from the training set. One indication of overfitting is an evaluation loss much larger than the training loss. In that case, you should use fewer memory units or get a larger dataset.
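One rough way to spot plagiarized tunes is to measure how much of a generated tune appears verbatim in the training file. The sketch below is an assumption about how such a check could look, not something included in these scripts. The names char_ngrams and copied_fraction are hypothetical, and the window length n is an arbitrary choice that should be longer than common header lines such as "K:C".

```python
def char_ngrams(text, n):
    """Return the set of all length-n character windows in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def copied_fraction(generated, training_ngrams, n):
    """Fraction of length-n windows of `generated` found verbatim in training."""
    total = len(generated) - n + 1
    if total <= 0:
        # The tune is shorter than one window, so nothing can be measured.
        return 0.0
    hits = sum(1 for i in range(total) if generated[i:i + n] in training_ngrams)
    return hits / total
```

A value close to 1.0 suggests the tune was largely memorized, while a low value suggests it is mostly new material. The cutoff that counts as plagiarism is a judgment call and depends on n.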

About

License:MIT License


Languages

Language:Python 100.0%