Self-attention before BiLSTM
katekats opened this issue
Katerina Katsarou commented
Hi,
Is it possible to apply the self-attention layer from this library directly to the input word embeddings, i.e. before the BiLSTM layer rather than after it? If so, how would the equations of the self-attention layer need to be rewritten?
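
Below is a minimal sketch of what such an ordering could look like. It is an assumption-laden illustration, not this library's API: it uses `tf.keras.layers.MultiHeadAttention` as a stand-in for the library's self-attention layer, and all hyperparameters (`vocab_size`, `embed_dim`, `max_len`, `lstm_units`, `num_heads`) are hypothetical placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical sizes, for illustration only
vocab_size, embed_dim, max_len, lstm_units = 10000, 128, 100, 64

inputs = layers.Input(shape=(max_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim)(inputs)      # word embeddings

# Self-attention applied directly to the embeddings:
# query = key = value = the embedding sequence x
attn = layers.MultiHeadAttention(num_heads=4, key_dim=embed_dim)(x, x)
x = layers.Add()([x, attn])                              # optional residual connection

# The BiLSTM now reads the attended embeddings instead of raw ones
x = layers.Bidirectional(layers.LSTM(lstm_units))(x)
outputs = layers.Dense(1, activation="sigmoid")(x)

model = Model(inputs, outputs)
model.summary()
```

Regarding the equations: structurally nothing changes. The attention scores and the weighted sum are simply computed over the embedding vectors x_t instead of the BiLSTM hidden states h_t, so wherever the layer's equations reference the hidden state h_t, substitute the embedding x_t.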