sgrvinod / a-PyTorch-Tutorial-to-Image-Captioning

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning


In models.py, line 80, att1 seems to be a duplicated calculation.

wbaiting opened this issue

Thanks for your tutorial!
I found this code in models.py, at line 80:

        att1 = self.encoder_att(encoder_out)  # (batch_size, num_pixels, attention_dim)

and line 203:

        attention_weighted_encoding, alpha = self.attention(encoder_out[:batch_size_t], h[:batch_size_t])

In every loop iteration, att1 will be calculated repeatedly, even though encoder_out stays the same for the whole decoding loop. Is there something wrong with att1?
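
To illustrate the point (a toy sketch of my own, with made-up dimensions): the projection depends only on encoder_out, which is fixed once the image is encoded, so every timestep repeats the same work:

    import torch
    from torch import nn

    # Toy check (my own sketch, dimensions made up): encoder_att(encoder_out)
    # depends only on encoder_out, which does not change during decoding.
    encoder_dim, attention_dim = 2048, 512
    encoder_att = nn.Linear(encoder_dim, attention_dim)
    encoder_out = torch.randn(4, 196, encoder_dim)  # (batch_size, num_pixels, encoder_dim)

    att1_once = encoder_att(encoder_out)           # computed once, before the loop
    for t in range(3):                             # stands in for the decoding loop
        att1_again = encoder_att(encoder_out)      # what happens now, at every step
        assert torch.equal(att1_once, att1_again)  # identical result each time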
I saw an implementation based on TensorFlow: https://github.com/jazzsaxmafia/show_attend_and_tell.tensorflow/blob/master/model_tensorflow.py

Instead of this code:

    att1 = self.encoder_att(encoder_out)  # (batch_size, num_pixels, attention_dim)
    att2 = self.decoder_att(decoder_hidden)  # (batch_size, attention_dim)
    att = self.full_att(self.relu(att1 + att2.unsqueeze(1))).squeeze(2)

He implemented the attention layer with the code below:

    context_encode = context_encode + \
        tf.expand_dims(tf.matmul(h, self.hidden_att_W), 1) + \
        self.pre_att_b
    context_encode = tf.nn.tanh(context_encode)
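
If it helps, here is a minimal PyTorch sketch of the same idea applied to this tutorial's Attention module. This is my own modification, not the tutorial's code: forward gains an extra att1 argument, and the caller computes the encoder projection once per batch instead of once per timestep:

    import torch
    from torch import nn

    class Attention(nn.Module):
        """Variant of the tutorial's Attention where the encoder projection
        (att1) is precomputed once by the caller rather than at every step."""

        def __init__(self, encoder_dim, decoder_dim, attention_dim):
            super(Attention, self).__init__()
            self.encoder_att = nn.Linear(encoder_dim, attention_dim)
            self.decoder_att = nn.Linear(decoder_dim, attention_dim)
            self.full_att = nn.Linear(attention_dim, 1)
            self.relu = nn.ReLU()
            self.softmax = nn.Softmax(dim=1)

        def forward(self, encoder_out, decoder_hidden, att1):
            # att1 = self.encoder_att(encoder_out), now done once by the caller
            att2 = self.decoder_att(decoder_hidden)  # (batch_size, attention_dim)
            att = self.full_att(self.relu(att1 + att2.unsqueeze(1))).squeeze(2)  # (batch_size, num_pixels)
            alpha = self.softmax(att)  # (batch_size, num_pixels)
            attention_weighted_encoding = (encoder_out * alpha.unsqueeze(2)).sum(dim=1)  # (batch_size, encoder_dim)
            return attention_weighted_encoding, alpha

The call at line 203 would then change roughly like this (a sketch, reusing the loop quoted above):

    # Before the timestep loop, once per batch:
    att1 = self.attention.encoder_att(encoder_out)  # (batch_size, num_pixels, attention_dim)
    # Inside the loop, slice it the same way as encoder_out and h:
    attention_weighted_encoding, alpha = self.attention(
        encoder_out[:batch_size_t], h[:batch_size_t], att1[:batch_size_t])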