maum-ai / nuwave

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling @ INTERSPEECH 2021

Home Page: https://mindslab-ai.github.io/nuwave/



Upsampled file has static in the background or complete silence

david-littlefield opened this issue

Hello there, Junhyeok & Seungu,

My name is David. I'm writing an article about your awesome repository for the Level Up Coding publication on Medium.

I'm still new to deep learning so I've been stumbling a bit through your implementation.

I think your paper mentioned that 8 epochs produced results similar to 1000 epochs.

I trained the model with a 1080 Ti (11GB) using a batch size of 3 for 7 epochs so far.
It created a checkpoint file for the 5th epoch.
It also created an EMA checkpoint file for the 7th epoch.

Here's the strange part...

The regular checkpoints produce an upsampled file that has constant static in the background.
The EMA checkpoints produce an upsampled file with complete silence.

Would either of you be able to help shed some light on how to make the most of your awesome repository?

With appreciation,

David

Hi David, first of all, thank you for reading our paper and writing an article!

I think you are confusing the noise schedule's steps (what you mentioned as 8 "epochs" and 1000 "epochs") with training epochs (one loop over the dataset). 8 and 1000 are the numbers of iterations used for sampling. Many more training epochs are needed! (We trained for over 2 weeks with two A100s or V100s.) Since our paper is short to fit the INTERSPEECH template, I recommend reading references such as Denoising Diffusion Probabilistic Models, DiffWave, and WaveGrad.
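A rough sketch of the distinction (the model, dataset, and update rules here are toy placeholders for illustration, not our actual implementation):

```python
import torch
import torch.nn as nn

# Toy stand-ins for illustration only -- the real repo defines the
# model, dataset, and diffusion math very differently.
model = nn.Linear(16, 16)
optimizer = torch.optim.Adam(model.parameters())
dataloader = [torch.randn(4, 16) for _ in range(10)]  # fake dataset

# Training: one "epoch" is one full pass over the dataset,
# and many epochs are needed.
for epoch in range(3):  # training epochs
    for audio in dataloader:
        noisy = audio + torch.randn_like(audio)  # crude noising step
        loss = nn.functional.mse_loss(model(noisy), audio)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Sampling: the "8" or "1000" is the number of reverse-diffusion
# iterations (the length of the noise schedule), not training epochs.
num_sampling_steps = 8  # or 1000
x = torch.randn(4, 16)  # start from pure noise
with torch.no_grad():
    for t in reversed(range(num_sampling_steps)):
        x = x - 0.1 * model(x)  # placeholder denoising step
```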

To answer your questions: since too much validation slows down training, we set the validation period with "check_val_every_n_epoch=2" in the trainer.py file and save the top-k models by loss.
On the other hand, EMA needs no data loading or GPU computation, so it is updated at the end of every epoch and all of the most recent k models are saved.
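For reference, here is roughly what that looks like in PyTorch Lightning (the callback arguments and the EMA decay below are illustrative assumptions, not necessarily the exact values in this repo):

```python
import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# Validate only every 2 epochs, and keep the k best checkpoints by
# validation loss (save_top_k=3 is an illustrative value).
checkpoint_cb = ModelCheckpoint(monitor="val_loss", save_top_k=3)
trainer = pl.Trainer(check_val_every_n_epoch=2, callbacks=[checkpoint_cb])

# EMA, by contrast, needs no data loading or GPU forward pass: it is
# just a running average of the weights, cheap to update every epoch.
@torch.no_grad()
def update_ema(ema_model, model, decay=0.999):
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)
```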

As for the last part: previous research reports results with EMA checkpoints, so we tried that too. However, we found that EMA checkpoints are not very different from regular checkpoints. I think the strange results you observed occurred because the model was trained for only a small number of epochs.

Awesome, thank you!