baidu-research / ba-dls-deepspeech

Getting Loss as NaN

Nishanksingla opened this issue

Hi,

I am getting a loss of NaN after 110 iterations in epoch 1.
Can anyone please suggest a good configuration (e.g., hyperparameter values or a learning rate) that results in good model accuracy?

Also, I was thinking of using the 5k-iteration model provided in the repo for transfer learning. Can anyone tell me how I can use transfer learning when training this model?
Any help would be much appreciated. :)
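
For what it's worth, warm-starting from a provided checkpoint usually just means loading its weights before training resumes. A minimal Keras sketch, assuming a hypothetical `build_model` helper and weights file name (the repo's actual entry points may differ):

```python
# Minimal Keras warm-start sketch. `build_model` and the weights file name
# are hypothetical placeholders, not necessarily this repo's actual API.
model = build_model(recur_layers=7, nodes=1000)   # hypothetical builder
model.load_weights('model_5k_iterations.h5')      # reuse pretrained weights

# Optionally freeze the lower layers so fine-tuning only updates the top
# of the network on the new data.
for layer in model.layers[:-1]:
    layer.trainable = False
```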

@Nishanksingla Did you train on the LibriSpeech database?
If so, I trained with the following parameters:
#layers: 7, #nodes: 1000, base_lr: 1e-4, clipnorm: 200
These parameters made the loss converge for me.
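
In Keras terms, the last two values map onto the optimizer roughly like this (a sketch only; the repo wires up its own CTC loss and training loop):

```python
from keras.optimizers import SGD

# clipnorm caps the gradient norm, which is the usual guard against the
# CTC loss blowing up to NaN; a small base learning rate helps as well.
optimizer = SGD(lr=1e-4, clipnorm=200.)
```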

But I have no idea how to use transfer learning. I would like to know, too.

@a00achild1 Thank you for the reply and sorry for replying so late.

Yes, I am using the LibriSpeech database.
Do you know if it is possible to use all of the GPU cards to train the model so that it converges faster?
In Caffe, for example, there is a "-gpu all" option for training.

Using multiple GPUs isn't supported with the Theano backend through Keras. You could switch to TensorFlow (see: https://www.tensorflow.org/tutorials/using_gpu#using_multiple_gpus).
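
For reference, the multi-GPU pattern from that guide looks like this (TF 1.x style; the device names assume a machine with at least two GPUs):

```python
import tensorflow as tf

# Build one replica of the computation per GPU ("tower"), then combine
# the results on the CPU, as shown in the linked TensorFlow guide.
c = []
for d in ['/gpu:0', '/gpu:1']:
    with tf.device(d):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
        c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
    total = tf.add_n(c)  # sum the per-GPU results

# allow_soft_placement falls back to another device if one is unavailable.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(total))
```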

@a00achild1 After how many iterations did you see convergence? I'm at 1400 iterations training on Librispeech-clean-100 (Keras optimizer) with your parameters, and my loss looks like this:

[plot: training loss over iterations]