CyberZHG / keras-self-attention

Attention mechanism for processing sequential data that considers the context for each timestamp.

Home Page: https://pypi.org/project/keras-self-attention/


The gradient is missing sometimes

SUNBERG010 opened this issue

Dear Zhao,
Recently I tried your self-attention layer in my work, and I really appreciate it.

However, training sometimes gets stuck right from the beginning: the loss stops decreasing, as if the gradient were vanishing. I have tried a few workarounds, but none of them is stable. Could you please give some suggestions?
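For reference, here is a minimal sketch of the kind of workarounds I tried (a sigmoid attention activation, as shown in the README, plus gradient clipping). The model, data shapes, and hyperparameters below are purely illustrative, not my actual setup:

```python
# Illustrative sketch only, not my real model. It shows the two
# workarounds I experimented with: a sigmoid activation on the
# attention scores and gradient clipping in the optimizer.
import os
os.environ['TF_KERAS'] = '1'  # make keras-self-attention use tensorflow.keras

from tensorflow import keras
from keras_self_attention import SeqSelfAttention

model = keras.models.Sequential([
    keras.layers.Embedding(input_dim=10000, output_dim=128),
    keras.layers.Bidirectional(keras.layers.LSTM(64, return_sequences=True)),
    # Sigmoid instead of the default activation on the attention scores.
    SeqSelfAttention(attention_activation='sigmoid'),
    keras.layers.GlobalMaxPooling1D(),
    keras.layers.Dense(1, activation='sigmoid'),
])

# Clip gradient norms so early updates cannot explode or stall training.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0),
    loss='binary_crossentropy',
    metrics=['accuracy'],
)
model.summary()
```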

Best wishes,
Sunberg

Same issue here, please help.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.