The gradient is missing sometimes
SUNBERG010 opened this issue
SUNBERG010 commented
Dear Zhao,
I recently tried your self-attention layer in my work, and I really appreciate it.
However, the decrease of the gradient sometimes gets stuck from the very beginning. I have tried a few workarounds, but the behavior is still not stable. Could you please give some suggestions?
Best wishes,
Sunberg
Shreeyash Geda commented
Same issue. Please help.
stale commented
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.