Questions about frame-level self-attention
create7859 opened this issue
create7859 commented
Hello.
In the ablation study section of the paper, you report results for performing self-attention at the frame level.
When you implemented this in code, did you use a for loop to apply the transformer to the frame tokens one frame at a time, or did you have a way to compute all frames at once?
I'm curious about your approach.