kuleshov / audio-super-res

Audio super resolution using neural networks

Using the LSD objective metric to evaluate the upsampled WAV files, I can't get the same results as in the paper.

DYJ1111 opened this issue

commented

I reviewed the DNN network using the single-speaker dataset in VCTK, but when I evaluate the upsampled WAV files, I can't get the same results. Maybe my LSD code is at fault, or is it something else?
Can you please tell me how the data is split into training and validation sets?
Thank you.

commented

I am trying to evaluate it too, and at the moment I can't get the same results either. It would be nice to have the original model.
How did you get the source code to run? I had to merge two different repos and fix many problems. I wish I could just use the original and have it work out of the box.
Were you able to get the original version to run?

commented

I just ran the training code, but I can't test the model.

commented

Why couldn't you test it?
Do you have your own repository?

Check out my repository if you want; I forked the project.

I run this to validate:

python run.py eval --logname ./full_end_normalized_parmas_ps8.s8-audiounet.lr0.00300.1.g4.b32.d8192.r4.lr0.000300.1.g4.b8/model.ckpt-33 --out-label singlespeaker-out --wav-file-list ../data/vctk/speaker1/speaker1-val-files.txt --r 4

where full_end_normalized_parmas_ps8.s8-audiounet.lr0.00300.1.g4.b32.d8192.r4.lr0.000300.1.g4.b8/ was exported after training.

As far as I understand, the train/validation split is defined in these files:

audio-super-res\data\vctk\speaker1\speaker1-train-files.txt
audio-super-res\data\vctk\speaker1\speaker1-val-files.txt

I trimmed those file lists to just 10 files each to test the code. That makes it faster.
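For anyone who wants to do the same trimming without editing the original lists by hand, here is a minimal sketch. It assumes the file lists contain one path per line; the helper name `write_subset` and the output filename are hypothetical and not part of the repo.

```python
from pathlib import Path


def write_subset(src, dst, n=10):
    """Write the first n non-empty lines of a file list to a new file."""
    lines = [ln for ln in Path(src).read_text().splitlines() if ln.strip()]
    subset = lines[:n]
    Path(dst).write_text("\n".join(subset) + "\n")
    return subset


# Example usage (the output name is made up for illustration):
# write_subset("../data/vctk/speaker1/speaker1-val-files.txt",
#              "../data/vctk/speaker1/speaker1-val-files-small.txt", n=10)
```

Writing to a separate file keeps the original split intact, so the full evaluation can still be run later by pointing --wav-file-list back at the original list.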

commented

just run the train code, but can't test the model.

What code did you use? The main repo from kuleshov? What libraries did you use? I couldn't manage to run it with the libraries listed in the description. I had to change from Python 2.x to 3.x and change the TensorFlow and Keras versions.

Same issue here. My implementation of LSD didn't even get the same scale, while the SNRs were comparable.

commented

Same issue here. My implementation of LSD didn't even get the same scale, while the SNRs were comparable.

Were you able to fix your problems?

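import librosa
import numpy as np

def compute_LSD(x_hr, x_pr):
    def get_power(x):
        # Log-power spectrogram, shape (1 + n_fft // 2, n_frames)
        S = librosa.stft(x, n_fft=2048)
        return np.log10(np.abs(S) ** 2)

    # log10(0) is -inf for silent bins; suppress the divide-by-zero warning
    with np.errstate(divide='ignore'):
        S1 = get_power(x_hr)
        S2 = get_power(x_pr)
        # Mean over axis 1 (frames), square root, then mean over axis 0 (frequency bins)
        lsd = np.mean(np.sqrt(np.mean((S1 - S2) ** 2, axis=1)), axis=0)
    return lsd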

Try using that snippet to calculate LSD. That seemed to help.
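If the numbers still look off, two things may be worth checking: the log base (log10 versus natural log changes the scale by a constant factor) and the order of averaging. LSD is usually defined as the root mean square over frequency bins within each frame, averaged over frames. With librosa's (frequency, frames) layout, the snippet above averages over axis=1 (frames) first, which generally gives a different number. Below is a NumPy-only sketch of the frame-wise variant for comparison. The function names, the Hann window, the hop size of 512, and the eps floor are my assumptions, and the framing does not replicate librosa's default centering and padding, so the values will not match librosa exactly.

```python
import numpy as np


def log_power_spectrogram(x, n_fft=2048, hop=512, eps=1e-10):
    """Log10 power spectrogram, shape (n_fft // 2 + 1, n_frames)."""
    x = np.asarray(x, dtype=np.float64)
    if len(x) < n_fft:
        # Zero-pad short signals so there is at least one full frame
        x = np.pad(x, (0, n_fft - len(x)))
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=1).T  # rows = frequency bins, cols = frames
    # eps (an assumption) keeps silent bins finite instead of -inf
    return np.log10(np.abs(spec) ** 2 + eps)


def lsd(x_ref, x_est, n_fft=2048, hop=512):
    """RMS over frequency bins within each frame, then mean over frames."""
    n = min(len(x_ref), len(x_est))  # compare only the overlapping samples
    s_ref = log_power_spectrogram(x_ref[:n], n_fft, hop)
    s_est = log_power_spectrogram(x_est[:n], n_fft, hop)
    per_frame = np.sqrt(np.mean((s_ref - s_est) ** 2, axis=0))  # axis 0 = frequency
    return float(np.mean(per_frame))
```

A quick sanity check: identical signals give 0, and scaling one signal's amplitude by 10 shifts every log10 power value by 2, so the LSD should come out very close to 2.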

I'm happy to help if this is still an issue. FWIW, I've updated the code to run in Python 3 with later versions of Keras and TensorFlow.