About the attention calculation in FASA
csguoh opened this issue · comments
Hi, authors.
This work really inspires me! I have a question about the Frequency domain-based self-attention solver.
In this line, it seems that you directly use element-wise multiplication; however, classic attention uses matrix multiplication (matmul, or @).
I cannot find any explanation of this in the paper, so could you give me some insight? Thanks :D
Hi, authors.
Element-wise multiplication is used here rather than matrix multiplication.
I am also confused by this and would appreciate an explanation. Thanks.
I guess the reason the authors use the element-wise product is that multiplication in the frequency domain is equivalent to convolution in the spatial domain.
Just my opinion:
out = self.norm(out) # calculate the score matrix
output = v * out # multiply the v matrix by the score matrix
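To illustrate the convolution-theorem reading suggested above, here is a minimal NumPy sketch (my own, not from the FASA code) showing that an element-wise product of two signals' FFTs equals their circular convolution in the spatial domain:

```python
import numpy as np

rng = np.random.default_rng(0)
score = rng.standard_normal(8)  # stand-in for the score matrix `out`
v = rng.standard_normal(8)      # stand-in for the value tensor `v`

# Spatial-domain circular convolution, computed directly from the definition.
n = len(v)
conv = np.array([sum(score[k] * v[(i - k) % n] for k in range(n))
                 for i in range(n)])

# Frequency-domain route: FFT both signals, multiply element-wise, inverse FFT.
freq = np.fft.ifft(np.fft.fft(score) * np.fft.fft(v)).real

# The two routes agree, so an element-wise product on frequency-domain
# features acts like a (content-dependent) convolution on spatial features.
assert np.allclose(conv, freq)
```

So if `v` and `out` are already frequency-domain representations, `v * out` would aggregate information globally across the spatial dimension, much like an attention-weighted sum, but at element-wise cost.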