This is a PyTorch implementation of Google's Onsets and Frames model, using the Maestro dataset for training and the Disklavier portion of the MAPS database for testing.
This project is quite resource-intensive; at least 32 GB of system memory and 8 GB of GPU memory are recommended.
- Convert the WAV file to mono 16 kHz FLAC with ffmpeg:
ffmpeg -y -loglevel fatal -i a.wav -ac 1 -ar 16000 a.flac
- Put the FLAC file into data/MAPS/flac (for example, a.flac)
- Rename t.tsv to match the FLAC file name and put it into data/MAPS/tsv/matched (for example, a.tsv)
- Run evaluate.py:
python3 evaluate.py model.pt --save-path output/
- The resulting MIDI file, a.mid, is placed in output/
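
The steps above could be scripted roughly as follows. This is only a sketch, not part of the repository: the helper names `ffmpeg_command`, `target_paths`, and `transcribe` are hypothetical, and it assumes that ffmpeg is on the PATH, that the script is run from the repository root, and that creating the target directories beforehand is harmless.

```python
import shutil
import subprocess
from pathlib import Path


def ffmpeg_command(wav_path, flac_path):
    """Build the ffmpeg call used in the steps above (mono, 16 kHz)."""
    return [
        "ffmpeg", "-y", "-loglevel", "fatal",
        "-i", str(wav_path),
        "-ac", "1", "-ar", "16000",
        str(flac_path),
    ]


def target_paths(wav_path, data_root="data/MAPS"):
    """Return the (flac, tsv) destinations, both named after the WAV file."""
    stem = Path(wav_path).stem
    root = Path(data_root)
    flac_path = root / "flac" / f"{stem}.flac"
    tsv_path = root / "tsv" / "matched" / f"{stem}.tsv"
    return flac_path, tsv_path


def transcribe(wav_path, tsv_path, model_path="model.pt", save_path="output/"):
    """Hypothetical helper: run the whole procedure for one recording."""
    flac_path, tsv_target = target_paths(wav_path)

    # Assumption: the directories may not exist yet, so create them.
    flac_path.parent.mkdir(parents=True, exist_ok=True)
    tsv_target.parent.mkdir(parents=True, exist_ok=True)
    Path(save_path).mkdir(parents=True, exist_ok=True)

    # Convert the audio and place it in data/MAPS/flac.
    subprocess.run(ffmpeg_command(wav_path, flac_path), check=True)

    # Copy the label file under the same base name as the audio.
    shutil.copy(tsv_path, tsv_target)

    # Same invocation as the evaluate.py step above.
    subprocess.run(
        ["python3", "evaluate.py", str(model_path), "--save-path", str(save_path)],
        check=True,
    )
```

With a.wav and t.tsv in the current directory, calling `transcribe("a.wav", "t.tsv")` would run the conversion, copy the label file to data/MAPS/tsv/matched/a.tsv, and start the evaluation. `check=True` makes the script stop at the first failing command instead of continuing with missing files.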