interactiveaudiolab / penn

Pitch Estimating Neural Networks (PENN)

Steps to reproduce the reported RPA accuracy of 98%

anthonio9 opened this issue

Hi,

I'm working on a project similar to yours, but focused solely on guitar pitch recognition. To get a better look at how the penn models train, I've integrated Weights & Biases into the project; check out my fork. I'm fairly sure the only thing I changed in the config is LOG_INTERVAL, which I set to 500. However, the training and validation accuracy reported during training oscillates around 50%, and the evaluation run after training reports similar results.

The figures below are the result of my attempt at training the fcnf0++ model from scratch.

Training accuracy reported every epoch
image

Validation accuracy reported every epoch
image

Training loss
image

The estimated performance is included in the overall.json file generated by the training script.

It's clear that I'm missing something. Do you have any advice on the steps needed to achieve the best results, or do you see any obvious issues in my approach? I tried to follow the README instructions: download, preprocess, and partition the mdb and ptdb datasets according to the fcnf0++ config, then run the training. The overall.json file reports that evaluation reaches around 60% on mdb and only around 20% on ptdb.

overall.json

My bad: the RCA, RPA, and RMSE metrics match those reported in the paper. I had simply misread the evaluation results. Thanks for your great papers and code!
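
For anyone who runs into the same confusion when reading evaluation output, here is a minimal sketch of how RPA (raw pitch accuracy) and RCA (raw chroma accuracy) are commonly defined. It is not penn's evaluation code. The function names, the 50-cent tolerance, and the convention that a reference value of 0 Hz marks an unvoiced frame are all assumptions made for illustration.

```python
import math


def cents_error(pred_hz, true_hz):
    """Signed pitch error in cents between a prediction and a reference."""
    return 1200.0 * math.log2(pred_hz / true_hz)


def pitch_accuracies(pred_hz, true_hz, threshold_cents=50.0):
    """Return (rpa, rca) over voiced frames only.

    Assumptions (not taken from penn): a reference of 0 Hz marks an
    unvoiced frame, and a non-positive prediction on a voiced frame
    counts as an error for both metrics.
    """
    voiced = 0
    rpa_hits = 0
    rca_hits = 0
    for pred, true in zip(pred_hz, true_hz):
        if true <= 0:
            continue  # skip unvoiced reference frames
        voiced += 1
        if pred <= 0:
            continue  # no usable prediction: counted as a miss
        error = abs(cents_error(pred, true))
        if error < threshold_cents:
            rpa_hits += 1
        # Fold the error into a single octave so octave errors are forgiven.
        folded = error % 1200.0
        folded = min(folded, 1200.0 - folded)
        if folded < threshold_cents:
            rca_hits += 1
    if voiced == 0:
        raise ValueError("no voiced frames to evaluate")
    return rpa_hits / voiced, rca_hits / voiced


# Hypothetical frames: exact match, octave error, unvoiced, exact match, ~39 cents sharp.
predicted = [440.0, 880.0, 300.0, 220.0, 450.0]
reference = [440.0, 440.0, 0.0, 220.0, 440.0]
rpa, rca = pitch_accuracies(predicted, reference)
print(f"RPA={rpa:.2f} RCA={rca:.2f}")
```

Both metrics are tolerance-based and computed over voiced frames only, so they are not directly comparable to a per-frame classification accuracy logged during training.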