Self-attention before BiLSTM
katekats opened this issue
Katerina Katsarou commented
Hi,
Is it possible to apply the self-attention layer from this library directly to the input word embeddings, i.e. before the BiLSTM layer rather than after it? If so, how would the equations of the self-attention layer need to be rewritten?
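
Below is a minimal sketch of what such an ordering could look like. It is an assumption-laden illustration, not this library's API: it uses `tf.keras.layers.MultiHeadAttention` as a stand-in for the library's self-attention layer, and all hyperparameters (`vocab_size`, `embed_dim`, `max_len`, `lstm_units`, `num_heads`) are hypothetical placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical sizes, for illustration only
vocab_size, embed_dim, max_len, lstm_units = 10000, 128, 100, 64

inputs = layers.Input(shape=(max_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim)(inputs)      # word embeddings

# Self-attention applied directly to the embeddings:
# query = key = value = the embedding sequence x
attn = layers.MultiHeadAttention(num_heads=4, key_dim=embed_dim)(x, x)
x = layers.Add()([x, attn])                              # optional residual connection

# The BiLSTM now reads the attended embeddings instead of raw ones
x = layers.Bidirectional(layers.LSTM(lstm_units))(x)
outputs = layers.Dense(1, activation="sigmoid")(x)

model = Model(inputs, outputs)
model.summary()
```

Regarding the equations: structurally nothing changes. The attention scores and the weighted sum are simply computed over the embedding vectors x_t instead of the BiLSTM hidden states h_t, so wherever the layer's equations reference the hidden state h_t, substitute the embedding x_t.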