uzaymacar / attention-mechanisms

Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.


Question about implementation

seunghwan1228 opened this issue and commented:

First, thank you for your work.

Following your description, I'm trying to implement each of the attention layers with TF 2.1.

I have a question: does line 221 need a "squeeze" applied to its input?

attention_score = RepeatVector(source_hidden_states.shape[1])(tf.squeeze(attention_score))

If I understood the full code correctly, h_t has already been expanded with expand_dims, so its attention score has shape (B, 1, H) before reaching RepeatVector. However, when I feed the (B, 1, H) tensor into RepeatVector, it raises an error: [repeat_vector is incompatible with the layer: expected ndim=2, found ndim=3.]

Thank you

For reference, line 221 as it appears in the repository:

attention_score = RepeatVector(source_hidden_states.shape[1])(attention_score) # (B, S*, H)
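Below is a minimal sketch of the shape mismatch and the proposed squeeze fix, assuming TF 2.x. The tensor names mirror the repository's code, but the dimensions B, S, and H are made up for illustration; this is not the repository's actual data path.

```python
# Minimal sketch of the shape issue, assuming TF 2.x.
# B, S, H are illustrative values; tensor names mirror the repository's code.
import tensorflow as tf
from tensorflow.keras.layers import RepeatVector

B, S, H = 4, 10, 32                                 # batch, source length, hidden size
source_hidden_states = tf.random.normal((B, S, H))  # encoder outputs
attention_score = tf.random.normal((B, 1, H))       # score after expand_dims on h_t

# RepeatVector expects a 2D input of shape (B, H) and raises
# "expected ndim=2, found ndim=3" for a (B, 1, H) tensor,
# so the singleton axis is removed first.
attention_score = tf.squeeze(attention_score, axis=1)  # (B, H)
attention_score = RepeatVector(source_hidden_states.shape[1])(attention_score)  # (B, S, H)
print(attention_score.shape)  # (4, 10, 32)
```

One note on the squeeze itself: passing axis=1 explicitly, rather than a bare tf.squeeze, avoids accidentally dropping the batch dimension as well when B == 1.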