k2-fsa / icefall

Home Page: https://k2-fsa.github.io/icefall/

Grad scale is too small

hoangtm-aimesoft opened this issue

Hi all. I'm trying to train a streaming Zipformer on the ReazonSpeech dataset (100 h) with the reazonspeech recipe, and I got this error:
[image]
Here are my training args:
[image]
Can someone help me? Thank you very much!

@JinZr Could you have a look?

Hi, please update your local cloned repo to the latest master branch and try again.

Thanks!

@JinZr Thank you for your reply. I have pulled the latest master branch, but I cannot see any update to the reazonspeech recipe. Can you explain?

Next time, try pasting more of the error output. That error appears if the model diverges, but without seeing more output it is hard to give more precise help than "reduce the LR".
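
For context, here is a minimal sketch of why divergence can surface as a "grad scale is too small" error. It assumes training runs in fp16 with a dynamic loss scale that behaves like PyTorch's `GradScaler` defaults (initial scale 65536, halved whenever gradients are non-finite, doubled after 2000 consecutive clean steps). The helper names and the `1e-5` abort threshold are assumptions for illustration, not code taken from the recipe.

```python
def next_scale(scale, found_inf, good_steps, growth_interval=2000):
    """One update of a dynamic loss scale (modeled on GradScaler defaults).

    Returns the new scale and the new count of consecutive clean steps.
    """
    if found_inf:
        # Non-finite gradients: the step is skipped and the scale is halved.
        return scale * 0.5, 0
    good_steps += 1
    if good_steps == growth_interval:
        # Enough clean steps in a row: the scale is doubled.
        return scale * 2.0, 0
    return scale, good_steps


def steps_until_abort(init_scale=65536.0, abort_below=1e-5):
    """Count consecutive overflowing steps before the scale drops below
    the (assumed) abort threshold."""
    scale, good_steps, steps = init_scale, 0, 0
    while scale >= abort_below:
        scale, good_steps = next_scale(scale, True, good_steps)
        steps += 1
    return steps


if __name__ == "__main__":
    print(steps_until_abort())
```

Under these assumptions, a model that produces inf/NaN gradients on every step drives the scale below the threshold after a few dozen steps. The scale itself is not the problem: it is a symptom of the loss blowing up, which is why lowering the learning rate is the usual first thing to try.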

@JinZr I'm working with an NVIDIA GeForce RTX 3090 with 24 GB of VRAM.

Yes, I am also training a streaming model on ReazonSpeech. I trained on 3k hours of data using a single NVIDIA 3090 GPU with a learning rate of 0.025. Training proceeds normally and convergence is quite good, with each epoch taking approximately 4 hours.