Using the LSD objective metric to evaluate the upsampled wav, I can't get the same results as in the paper.

Question

DYJ1111 opened this issue 5 years ago · comments

I tried to reproduce the DNN network results using the single-speaker dataset in VCTK, but when I evaluate the upsampled wav, I can't get the same results. Maybe my LSD code is faulty, or is it something else?
Can you please tell me how the data is split into training and validation sets?
Thank you.

J.K · Answer 1 · Tue Nov 26 2019 23:26:03 GMT+0800 (China Standard Time)

I am trying to evaluate it too, and at the moment I can't get the same results either. It would be nice to have the original model.
How did you get the source code to run? I had to merge two different repos and fix many problems. I wish I could just use the original and have it work out of the box.
Were you able to get the original version running?

DYJ · Answer 2 · Wed Nov 27 2019 11:24:55 GMT+0800 (China Standard Time)

I just ran the training code, but I can't test the model.

J.K · Answer 3 · Wed Nov 27 2019 21:01:53 GMT+0800 (China Standard Time)

Why couldn't you test it?
Do you have your own repository?

Check out my repository if you want; I forked the project.

I run this to validate:

python run.py eval --logname ./full_end_normalized_parmas_ps8.s8-audiounet.lr0.00300.1.g4.b32.d8192.r4.lr0.000300.1.g4.b8/model.ckpt-33 --out-label singlespeaker-out --wav-file-list ../data/vctk/speaker1/speaker1-val-files.txt --r 4

where full_end_normalized_parmas_ps8.s8-audiounet.lr0.00300.1.g4.b32.d8192.r4.lr0.000300.1.g4.b8/ was exported after training.

The train/validation split, as far as I understand, is defined in these files:

audio-super-res\data\vctk\speaker1\speaker1-train-files.txt
audio-super-res\data\vctk\speaker1\speaker1-val-files.txt

I modified the files to test the code and used just 10 files in each, which is faster.
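For anyone who wants to do the same quick check, here is a minimal sketch of trimming a file list to its first few entries. It assumes the list files are plain text with one wav path per line; the helper name `take_first_lines` and the `-small` output names are made up for illustration and are not part of the repository.

```python
from pathlib import Path


def take_first_lines(src, dst, n=10):
    """Copy the first n non-empty lines of src into dst and return them."""
    lines = Path(src).read_text().splitlines()
    kept = [ln for ln in lines if ln.strip()][:n]
    Path(dst).write_text("\n".join(kept) + "\n")
    return kept


# Hypothetical usage (the output file names are assumptions):
# take_first_lines("speaker1-train-files.txt", "speaker1-train-files-small.txt")
# take_first_lines("speaker1-val-files.txt", "speaker1-val-files-small.txt")
```

Writing to a separate file keeps the original lists intact, so the full split can still be used for the final evaluation.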

J.K · Answer 4 · Wed Nov 27 2019 22:54:36 GMT+0800 (China Standard Time)

> just run the train code, but can't test the model.

What code did you use? The main repo from kuleshov? Which libraries did you use? I couldn't manage to run it with the libs in the description. I had to change from Python 2.x to 3.x and change the TensorFlow and Keras versions.

Shuai Yuan · Answer 5 · Tue Dec 03 2019 23:58:36 GMT+0800 (China Standard Time)

Same issue here. My implementation of LSD didn't even get the same scale, while the SNRs were comparable.
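For reference when comparing numbers, this is a minimal sketch of SNR as it is commonly defined for this task: the ratio of reference signal energy to error energy, in dB. The function name `snr_db` is made up here, and this is not necessarily the exact code used in the repository.

```python
import numpy as np


def snr_db(x_ref, x_est):
    """Signal-to-noise ratio in dB between a reference and an estimate."""
    signal_energy = np.sum(x_ref ** 2)
    noise_energy = np.sum((x_ref - x_est) ** 2)
    return 10.0 * np.log10(signal_energy / noise_energy)
```

Both signals must have the same length and be time-aligned; identical signals give a division by zero.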

J.K · Answer 6 · Sun Jan 12 2020 20:48:40 GMT+0800 (China Standard Time)

> Same issue here. My implementation of LSD didn't even get the same scale, while the SNR's were comparable.

Were you able to fix your problems?

Vinayak Sharma · Answer 7 · Wed Nov 04 2020 14:14:35 GMT+0800 (China Standard Time)

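import librosa
import numpy as np

def compute_LSD(x_hr, x_pr):
    """Log-spectral distance between reference x_hr and prediction x_pr."""
    def get_power(x):
        # Log-power spectrogram, shape (freq_bins, frames)
        S = librosa.stft(x, n_fft=2048)
        return np.log10(np.abs(S) ** 2)
    # Silence divide-by-zero warnings from log10(0) in empty bins
    with np.errstate(divide='ignore'):
        S1 = get_power(x_hr)
        S2 = get_power(x_pr)
        lsd = np.mean(np.sqrt(np.mean((S1 - S2) ** 2, axis=1)), axis=0)
    return lsd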

Try using that snippet to calculate LSD. That seemed to help.
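One thing worth checking when LSD values are off by a constant factor is the logarithm base. Since ln(p) = ln(10) · log10(p), computing LSD with the natural log gives values about 2.3 times larger than with log10. The sketch below illustrates this with a NumPy-only spectrogram, so it runs without librosa. The helpers `power_spectrogram` and `lsd`, the hop size, and the `eps` floor are assumptions made for illustration, not the repository's code. Unlike the snippet above, it takes the RMS over frequency bins per frame and then averages over frames, so its absolute values will not match the librosa-based ones.

```python
import numpy as np


def power_spectrogram(x, n_fft=2048, hop=512):
    """Hann-windowed power spectrogram, shape (frames, freq_bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def lsd(x_ref, x_est, log=np.log10, eps=1e-10):
    """RMS log-power difference over frequency, averaged over frames."""
    diff = log(power_spectrogram(x_ref) + eps) - log(power_spectrogram(x_est) + eps)
    return float(np.mean(np.sqrt(np.mean(diff ** 2, axis=1))))


# Synthetic signals, only to compare the two log bases
rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)
est = ref + 0.1 * rng.standard_normal(16000)

lsd_log10 = lsd(ref, est, log=np.log10)
lsd_ln = lsd(ref, est, log=np.log)
scale = lsd_ln / lsd_log10  # equals ln(10) up to floating-point error
```

If your numbers differ from the reported ones by roughly this factor, the log base is a likely cause; other differences (FFT size, averaging axes, clipping) would show up as less regular discrepancies.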

Sawyer Birnbaum · Answer 8 · Wed Apr 07 2021 11:01:22 GMT+0800 (China Standard Time)

I'm happy to help if this is still an issue. FWIW, I've updated the code to run in Python 3 with later versions of Keras and TensorFlow.