sannawag / data_driven_pitch_corrector


How to use model correctly?

swlaiab opened this issue · comments

I have tried inputting several different groups of vocal, backing track, and pYIN files. They are all successfully divided into notes and shifted. However, after evaluation and output generation, when I use synthesis.py to listen to the shifted and corrected version of the song, it does not seem to apply much tuning. In the graph of ground truth vs. predicted shift, the predicted shift (blue line) is mostly a horizontal line with very little amplitude variation, which is quite different from the ground truth (red line). It seems I am not using the model correctly. Are any steps missing from the workflow below to run the tuning and test successfully?

  • generate the pYIN output, convert it to .npy, and use it as input (see the conversion sketch after this list)
  • input the vocal wav
  • input the backing track wav
  • update intonation.csv
  • set the program arguments so that testing is true instead of training
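Not part of the repository, but for reference, a minimal sketch of the CSV-to-.npy conversion mentioned in the first step. It assumes the Sonic Annotator pYIN output is a headerless CSV of timestamp and frequency columns; the file names here are hypothetical.

```python
# Minimal sketch: convert a Sonic Annotator pYIN CSV into a .npy file.
# The column layout and file names are assumptions, not the repository's
# documented format.
import numpy as np

def pyin_csv_to_npy(csv_path, npy_path):
    # Sonic Annotator's pYIN pitch-track output is typically two columns:
    # timestamp (seconds) and frequency (Hz), with no header row.
    data = np.loadtxt(csv_path, delimiter=",")
    np.save(npy_path, data)
    return data

if __name__ == "__main__":
    arr = pyin_csv_to_npy("vocals_pyin.csv", "vocals_pyin.npy")
    print(arr.shape)  # e.g. (num_frames, 2)
```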

Thanks for your attention.

Hi, sorry to hear that you ran into this issue! Would you mind sending the full list of arguments you used when running the program? Did you set resume to true? I am also curious whether the prediction was exactly a horizontal line or just had values that were close to constant.
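One rough way to answer the "exactly flat vs. nearly flat" question is to look at the spread of the predicted shifts. This sketch assumes the predictions can be loaded as a 1-D NumPy array; the file name is hypothetical.

```python
# Quantify how flat the predicted shift curve is.
# "shift_predictions.npy" is a placeholder for wherever the predictions were saved.
import numpy as np

pred = np.load("shift_predictions.npy")
print("std of predicted shifts:   ", np.std(pred))
print("peak-to-peak range of pred:", np.ptp(pred))
# A standard deviation and range near zero mean the prediction is essentially
# a constant line, rather than just a low-amplitude curve.
```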

@sannawag Hi Sanna, I use Sonic Annotator to extract pitch with pYIN, but its output only has two columns of data. When I check your demo, your csv file has four columns. Could you tell me how you got that? Thanks!
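While waiting for an answer, a quick way to compare the two layouts is to load both files and inspect their shapes and the extra columns. The file names below are hypothetical, and this does not assume anything about what the demo's extra columns contain.

```python
# Compare the two-column Sonic Annotator output with the four-column demo CSV.
import pandas as pd

sonic = pd.read_csv("sonic_annotator_pyin.csv", header=None)
demo = pd.read_csv("demo_pitch.csv", header=None)

print(sonic.shape, demo.shape)  # compare column counts
print(demo.head())              # inspect what the extra columns hold
```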


Could you get 4 columns?