Poor results on Mandarin singing voice data

Question

WelkinYang opened this issue 2 years ago

HeyangXue1997 commented 2 years ago

Thank you for your work. I used this repository to run experiments on a Mandarin singing voice dataset, but the results after 500k training steps are not satisfactory. The main problem is that the spectrogram looks like separate chunks stitched together one after another: there are very obvious vertical streaks, and they are clearly audible.

I am using the default hyperparameter configuration. How can I avoid this problem?

Max Morrison · Answer 1 · Sat May 14 2022 03:20:54 GMT+0800 (China Standard Time)

Boundary artifacts may arise for some types of data. This is mentioned in the paper. If you figure out how to address those artifacts, we'd love to know as well.

My guess is that the better the GAN can fit the data distribution, the less likely the artifacts are. So more data, more training steps, and simpler (e.g., denoised, dereverbed) data will probably help.
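
One way to check whether the streaks really line up with chunk boundaries, and to compare checkpoints or datasets numerically, is to measure how much the log-magnitude spectrum changes from frame to frame near the boundaries versus elsewhere. The sketch below is only an illustration and is not part of this repository: the function names, `chunk_size`, `n_fft`, and `hop` are all hypothetical, and `chunk_size` must be set to whatever chunk length (in samples) your configuration actually generates.

```python
import numpy as np

def log_magnitude_spectrogram(audio, n_fft=1024, hop=256):
    """Log-magnitude STFT of a mono signal, shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack(
        [audio[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-5)

def spectral_flux(log_mag):
    """Mean absolute change between consecutive frames, shape (frames - 1,)."""
    return np.abs(np.diff(log_mag, axis=0)).mean(axis=1)

def boundary_flux_ratio(audio, chunk_size, n_fft=1024, hop=256):
    """Average flux near assumed chunk boundaries divided by flux elsewhere.

    chunk_size is an assumption supplied by the caller (samples per chunk).
    Values well above 1 suggest discontinuities aligned with the boundaries.
    """
    flux = spectral_flux(log_magnitude_spectrogram(audio, n_fft, hop))
    # Flux index k compares frames k and k + 1; this is the sample position
    # at the middle of the span those two frames cover.
    position = (np.arange(len(flux)) + 0.5) * hop + n_fft / 2
    offset = position % chunk_size
    distance = np.minimum(offset, chunk_size - offset)
    # "Near" means at least one of the two frames may overlap a boundary.
    near = distance <= n_fft / 2 + hop / 2
    return flux[near].mean() / (flux[~near].mean() + 1e-12)

# Synthetic check: a sine whose phase jumps at every chunk boundary.
chunk_size, n_chunks = 4096, 8
t = np.arange(chunk_size * n_chunks)
phase = np.repeat(np.arange(n_chunks) * 1.0, chunk_size)
stitched = np.sin(2 * np.pi * t / 64 + phase)
print(boundary_flux_ratio(stitched, chunk_size))
```

Running the same measurement on generated audio from different checkpoints, or on models trained with more or cleaner data, would give a rough number to track alongside listening tests. It does not fix the artifacts; it only makes them easier to quantify.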