etzinis / sudo_rm_rf

Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features, an approach that enables more efficient separation of sources from mixtures.

Obtained results of SI-SNRi=18.9?

JusperLee opened this issue

What kind of script can be used to make SI-SNRi reach 18.9?

python run_improved_sudormrf.py --train WHAM --val WHAM --test WHAM --train_val WHAM --separation_task sep_clean --n_train 20000 --n_test 3000 --n_val 3000 --n_train_val 3000 --out_channels 512 --num_blocks 34 -cad 0 1 -bs 4 --divide_lr_by 3. --upsampling_depth 5 --patience 49 -fs 8000 -tags sudo_rm_rf_34 --project_name sudormrf_wham --zero_pad --clip_grad_norm 5.0 --model_type relu

thanks a lot

I can't get the result of 18.9 with these parameters. Do you have a log file I could use for reference? @etzinis

What result are you getting? Are you using my code 100%?

I double-checked the result with my implementation here and with an Asteroid implementation as well.

val_sisnri: 18.205505715329497. I used your code 100%.
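For reference, SI-SNRi is the gain in scale-invariant SNR of the separated estimate over the unprocessed mixture. Below is a minimal pure-Python sketch of the usual definition; the function names `si_snr` and `si_snr_improvement` and the toy signals are illustrative assumptions, not code from this repository.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length 1-D signals."""
    n = len(target)
    # Remove the mean so that DC offsets do not affect the measure.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target to obtain the scaled reference.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    s_target = [(dot / tgt_energy) * t for t in tgt]
    # Whatever is left after the projection counts as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(x * x for x in e_noise) + eps
    return 10.0 * math.log10(num / den + eps)


def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)


# Toy example (hypothetical signals): the interference is orthogonal to the target.
target = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
interference = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
mixture = [t + i for t, i in zip(target, interference)]
estimate = [t + 0.1 * i for t, i in zip(target, interference)]
improvement = si_snr_improvement(estimate, mixture, target)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score unchanged, which is why the metric is called scale-invariant.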

Also, you have to let it run for 200 epochs.

But my batch size is not 4; I set it to 16.

So it's not the same...

OK, I will set the batch size to 4 and test again.

This is the validation on the test set:
[image]

I used patience = 30 here but this will not make a difference

I will try again to see if I can achieve this result.

Btw, how did you manage to fit a batch size of 16 on 2 GPUs? Or did you use more?

With 8 GPUs; each GPU gets a batch size of 2.
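To make the difference between the two setups concrete, here is a small sketch of the arithmetic. It assumes that `-bs` is the total batch size per optimizer step and that one epoch is a single pass over the `--n_train 20000` examples from the command above; the helper name `steps_per_epoch` is hypothetical.

```python
import math


def steps_per_epoch(n_train, total_batch_size):
    """Number of optimizer steps in one pass over the training set."""
    return math.ceil(n_train / total_batch_size)


n_train = 20000                      # --n_train from the command above
reference_batch = 4                  # -bs 4 in the reference command
effective_batch = 8 * 2              # 8 GPUs with 2 examples each

reference_steps = steps_per_epoch(n_train, reference_batch)
multi_gpu_steps = steps_per_epoch(n_train, effective_batch)
print(effective_batch, reference_steps, multi_gpu_steps)
```

Under these assumptions, the 8-GPU run performs a quarter as many parameter updates per epoch as the reference run, so the two are not directly comparable epoch for epoch.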

Wow glad that you have the resources.

Also make sure you have the latest version of the code: git pull

[image]
Is this result normal?

Yes, totally normal. You have not even reduced the learning rate for a second time and you are already at 18.2, so you will probably score better than the paper :P
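As a rough illustration of the learning-rate reductions mentioned here, the sketch below assumes a simple step schedule in which the learning rate is divided by `--divide_lr_by` once every `--patience` epochs. The exact logic in `run_improved_sudormrf.py` may differ, and the initial learning rate of 1e-3 is an assumption, since the command above does not set it.

```python
def step_lr(initial_lr, epoch, patience=49, divide_lr_by=3.0):
    """Learning rate under an assumed step schedule (hypothetical helper)."""
    # Every `patience` epochs the rate is divided once more by `divide_lr_by`.
    return initial_lr / (divide_lr_by ** (epoch // patience))


initial_lr = 1e-3  # assumed value, not taken from the command above
for epoch in (0, 49, 98, 147, 196):
    print(epoch, step_lr(initial_lr, epoch))
```

With `--patience 49` and 200 epochs, this assumed schedule would reduce the rate four times, at epochs 49, 98, 147 and 196.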

That's great, thanks a lot.

Hello, I cannot reproduce the result in the paper (SI-SNRi=18.9) by any method. Is there a more detailed trick I should know about? @etzinis
[image]

How much are you getting?

SI-SNRi = 18.66

First of all, if I remember correctly, you were running with a different batch size or something on multiple GPUs, so that's one difference.

Despite that, round(18.66, 1) = 18.7 => 18.9 - 18.7 = 0.2 dB (about 1% relative difference), which is neither statistically nor acoustically different by any means.
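The arithmetic above can be checked in a few lines. The variable names are illustrative, and the conversion to a linear power ratio is an extra step added here only to give a sense of scale.

```python
paper_sisnri = 18.9        # value reported in the paper
reproduced_sisnri = 18.66  # value reported in this thread

rounded = round(reproduced_sisnri, 1)   # one decimal place, as in the paper
gap_db = paper_sisnri - rounded         # absolute gap in dB
relative_gap = gap_db / paper_sisnri    # gap relative to the paper's value
power_ratio = 10 ** (gap_db / 10)       # the same gap as a linear power ratio

print(f"gap: {gap_db:.1f} dB, relative: {relative_gap:.2%}, "
      f"power ratio: {power_ratio:.3f}")
```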

There is no other trick besides running the code as specified in the paper.

Okay, then there is no problem.

Thank you very much for your answers