kracwarlock / action-recognition-visual-attention

Action recognition using soft attention based deep recurrent neural networks

Home Page: http://www.cs.toronto.edu/~shikhar/projects/action-recognition-attention


Location softmax

pmorerio opened this issue · comments

Hi, thanks for sharing the code!
It looks like the location softmax implemented in the conditional LSTM is not the one you describe in the paper 'Action Recognition using Visual Attention' (eq. 4), but rather the one described in eqns. 4, 5, 6, and 9 of 'Show, Attend and Tell: Neural Image Caption Generation with Visual Attention'. Could you please comment on that? Really appreciated, thanks!

Hi @kracwarlock ,

I have a question about your master's thesis: in equation 2.4 you mention that "Wi are the weights mapping to the ith element of the location softmax". I don't fully understand this. Can you explain it a little more? Thank you!

@pmorerio The location softmax in my case is the one presented in eq. (4), while in their case it is eqs. (4)+(5).
Eqs. (6) and (9) hold true for both papers.
The results in our paper are based on our paper's eq. (4).
However, the Show, Attend and Tell paper's (4)+(5) seems to produce better results, since it looks at the current frame's features as well.

@GerardoHH I just meant that W_i are the weights for l_{t,i} (the ith element of the location softmax).

@kracwarlock Thanks a lot for your answer. I must say that this does not look to be the case, since in the code the location softmax is calculated based on both the current frame's features AND the previous hidden state. The two contributions are then added together here and used to activate a softmax layer.
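For what it's worth, the difference being discussed can be sketched in a few lines of NumPy. This is only an illustration, not the repo's actual code: the dimensions (a 7x7 feature map, 512-d features, 256-d hidden state) are made up, and the Show, Attend and Tell score is simplified to a linear per-location term instead of the MLP attention function used in that paper.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the location dimension
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical dimensions: K^2 = 49 locations (7x7 map), D = 512 feature dim, H = 256 hidden dim
K2, D, H = 49, 512, 256
rng = np.random.default_rng(0)

W_h = rng.standard_normal((K2, H)) * 0.01  # weights on the previous hidden state h_{t-1}
W_x = rng.standard_normal((K2, D)) * 0.01  # extra weights on the current frame's features

h_prev = rng.standard_normal(H)            # LSTM hidden state h_{t-1}
X_t = rng.standard_normal((K2, D))         # current frame's K^2 feature vectors

# Variant 1: attention depends only on h_{t-1}
# (the action-recognition paper's eq. (4))
l_paper = softmax(W_h @ h_prev)

# Variant 2: a contribution from the current frame's features is added in as well
# (in the spirit of Show, Attend and Tell's eqs. (4)-(5), simplified to a linear score)
l_sat = softmax(W_h @ h_prev + np.einsum('kd,kd->k', W_x, X_t))

# Both are valid distributions over the K^2 locations
print(l_paper.shape, round(l_paper.sum(), 6), round(l_sat.sum(), 6))
```

The point of the exchange above is that the released code computes something like `l_sat` (both terms), whereas the paper's eq. (4) describes `l_paper` (hidden state only).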