phoneme-informed-note-level-singing-transcription

A pretrained model for "A Phoneme-informed Neural Network Model for Note-level Singing Transcription", ICASSP 2023

Requirements

  • torch==1.13.1
  • torchaudio==0.13.1
  • nnAudio==0.3.2
  • mido==1.2.10
  • mir_eval==0.7
  • librosa==0.9.1
  • numpy==1.23.4
  • wquantiles==0.4
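These pinned versions can be installed directly with pip, for example as below (a sketch; whether these exact pins resolve cleanly depends on your Python version and platform):

$ pip install torch==1.13.1 torchaudio==0.13.1 nnAudio==0.3.2 mido==1.2.10 mir_eval==0.7 librosa==0.9.1 numpy==1.23.4 wquantiles==0.4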

Or if you are using Poetry, you can install the dependencies by running

$ poetry install

Usage

$ python infer.py checkpoints/model.pt INPUT_FILE OUTPUT_FILE --bpm BPM_OF_INPUT_FILE --device DEVICE
  • INPUT_FILE is the path to the input audio file.
  • OUTPUT_FILE is the path to the output MIDI file (defaults to out.mid if omitted).
  • BPM_OF_INPUT_FILE is the BPM of the input audio file (defaults to 120 if omitted).
  • DEVICE is the device to run the model on (defaults to cuda:0 if available, otherwise cpu).
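For example, to transcribe a 95 BPM recording on the first GPU (the file names here are placeholders):

$ python infer.py checkpoints/model.pt singing.wav singing.mid --bpm 95 --device cuda:0

The resulting MIDI can be inspected with mido, which is already among the dependencies (a minimal sketch, assuming the default out.mid output name):

import mido

# Load the transcribed output (out.mid is the default output file name).
midi = mido.MidiFile("out.mid")

# Print every note event produced by the transcription.
for track in midi.tracks:
    for msg in track:
        if msg.type in ("note_on", "note_off"):
            print(msg)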

Blog with Demo Examples

https://seyong92.github.io/phoneme-informed-transcription-blog/

Remarks

To pull the model checkpoint from the GitHub repository, Git LFS is needed.
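With Git LFS installed, a typical way to fetch it is (repository URL inferred from the repository name):

$ git lfs install
$ git clone https://github.com/seyong92/phoneme-informed-note-level-singing-transcription.git

If you already cloned the repository without the checkpoint, running $ git lfs pull inside the clone fetches the LFS-tracked files.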

If you have trouble downloading the model checkpoint through Git LFS, the checkpoint is also available at this link.

License

MIT License

