1ytic / warp-rnnt

CUDA-Warp RNN-Transducer

warning that forward/backward mismatch

maxwellzh opened this issue · comments

The following warning messages are occasionally printed during training:

...
WARNING: sample 10 [81, 25] has a forward/backward mismatch -0.000083 / -0.000083
...
WARNING: sample 11 [62, 28] has a forward/backward mismatch -0.000188 / -0.000188

The source code issues this warning when abs(a-b)/abs(max(a,b)) > 0.001.
I'm sorry, but I have difficulty reading core_gather.cu.
Could you explain in more detail the function kernel_fill_costs() and the alphas and betas?
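
To make the condition concrete, here is a minimal Python sketch of the relative check quoted above. The function name `has_mismatch` and the two sample values are hypothetical and chosen only for illustration; they are not taken from the warp-rnnt source. It shows that two values which print identically at six decimals can still differ by more than 0.1% in relative terms when both are close to zero.

```python
def has_mismatch(a: float, b: float, rel_tol: float = 1e-3) -> bool:
    """Relative check as quoted in the issue: abs(a-b)/abs(max(a,b)) > rel_tol."""
    return abs(a - b) / abs(max(a, b)) > rel_tol


# Hypothetical forward/backward values, both very close to zero.
forward, backward = -0.0000830, -0.0000832

# Both values round to the same six-decimal string, as in the warning text.
print("%f / %f" % (forward, backward))

# The absolute gap is only 2e-7, but relative to 8.3e-5 it is about 0.24%.
print(has_mismatch(forward, backward))   # True

# The same absolute gap on a larger magnitude passes the check.
print(has_mismatch(-83.0, -83.0000002))  # False
```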

These variables come from the classical forward/backward algorithm. The alphas and betas must yield equal totals, up to small numerical errors. For some reason the values look very small. Please check that you are providing the right input data.
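
For readers unfamiliar with the algorithm, the following is a textbook-style sketch of the RNN-Transducer forward/backward recursion in log space, written in plain Python. It is not the warp-rnnt kernel: the function names, the `log_probs[t][u][k]` layout, the blank index 0, and the random test input are all assumptions made for illustration. It only demonstrates the property mentioned above, namely that the total computed from the alphas equals the total computed from the betas up to rounding.

```python
import math
import random


def logaddexp(x: float, y: float) -> float:
    """Numerically stable log(exp(x) + exp(y))."""
    m = max(x, y)
    if m == -math.inf:
        return -math.inf
    return m + math.log(math.exp(x - m) + math.exp(y - m))


def log_softmax(row):
    """Normalize a list of logits into log-probabilities."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def forward_backward(log_probs, labels, blank=0):
    """Return (forward, backward) log-likelihoods for one sample.

    log_probs[t][u][k]: log-probability of symbol k at frame t after u labels.
    """
    T, U = len(log_probs), len(labels)
    NEG = -math.inf

    # alpha[t][u]: log-probability of reaching node (t, u) from the start.
    alpha = [[NEG] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t == 0 and u == 0:
                continue
            stay = alpha[t - 1][u] + log_probs[t - 1][u][blank] if t > 0 else NEG
            emit = alpha[t][u - 1] + log_probs[t][u - 1][labels[u - 1]] if u > 0 else NEG
            alpha[t][u] = logaddexp(stay, emit)
    forward = alpha[T - 1][U] + log_probs[T - 1][U][blank]

    # beta[t][u]: log-probability of finishing the sequence from node (t, u).
    beta = [[NEG] * (U + 1) for _ in range(T)]
    beta[T - 1][U] = log_probs[T - 1][U][blank]
    for t in reversed(range(T)):
        for u in reversed(range(U + 1)):
            if t == T - 1 and u == U:
                continue
            stay = beta[t + 1][u] + log_probs[t][u][blank] if t < T - 1 else NEG
            emit = beta[t][u + 1] + log_probs[t][u][labels[u]] if u < U else NEG
            beta[t][u] = logaddexp(stay, emit)
    backward = beta[0][0]
    return forward, backward


# Small random example: T=5 frames, U=3 labels, vocabulary of 4 symbols.
random.seed(0)
T, U, V = 5, 3, 4
labels = [1, 2, 3]
log_probs = [[log_softmax([random.gauss(0.0, 1.0) for _ in range(V)])
              for _ in range(U + 1)] for _ in range(T)]

forward, backward = forward_backward(log_probs, labels)
print(forward, backward)
```

Both totals sum over the same set of alignment paths, so any difference between them comes only from floating-point rounding.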

If this error were related to the input data, it should repeat in every training epoch, but no warning is thrown at the beginning.
And as you can see, all the warnings are generated with small values, so I wonder whether something is causing an underflow in the computation.

I can't remember these values ever being so small in my own experience. Maybe we should add a check on the absolute value as well, not only on the ratio. Feel free to change this condition and recompile the package from source.
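
A combined condition of the kind suggested here might look like the following Python sketch. The function name `has_mismatch_with_floor` and the `abs_tol` default of 1e-6 are hypothetical choices, not values from warp-rnnt, and the actual change would have to be made in the CUDA source before recompiling.

```python
def has_mismatch_with_floor(a: float, b: float,
                            rel_tol: float = 1e-3,
                            abs_tol: float = 1e-6) -> bool:
    """Warn only if the gap exceeds both an absolute and a relative tolerance.

    abs_tol is an assumed value; tune it to the precision of the kernel.
    """
    diff = abs(a - b)
    if diff <= abs_tol:
        # Negligible absolute gap: skip the relative test, which would
        # otherwise blow up when both values are close to zero.
        return False
    return diff / abs(max(a, b)) > rel_tol


# Hypothetical near-zero values: no longer flagged.
print(has_mismatch_with_floor(-0.0000830, -0.0000832))  # False

# A genuine disagreement on larger values is still flagged.
print(has_mismatch_with_floor(-10.0, -10.5))            # True
```

The early return also avoids a division by zero when both values are exactly 0.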