Transformer repo based on PyTorch and d2l
V838Mon opened this issue 2 years ago
Pardon me, could you show me how to implement the per-head computation of the multi-head self-attention block using explicit for loops? I'm required to implement it with for loops rather than the batched reshape trick. Thank you!
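For reference, here is a minimal sketch of one way to do this, assuming the standard scaled dot-product formulation used in d2l. The class name `LoopMultiHeadAttention` and the parameters `d_model` and `num_heads` are illustrative, not taken from this repo: instead of reshaping Q/K/V into a `(batch, heads, seq, d_head)` tensor and computing all heads in one batched matmul, it slices out each head's columns and attends over them inside an explicit for loop.

```python
import math
import torch
import torch.nn as nn

class LoopMultiHeadAttention(nn.Module):
    """Multi-head self-attention computed one head at a time with a for loop.

    A sketch for illustration, not the repo's actual implementation.
    """

    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # One joint projection each for Q, K, V; every head uses its own slice.
        self.W_q = nn.Linear(d_model, d_model)
        self.W_k = nn.Linear(d_model, d_model)
        self.W_v = nn.Linear(d_model, d_model)
        self.W_o = nn.Linear(d_model, d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        head_outputs = []
        for h in range(self.num_heads):  # explicit loop over heads
            s, e = h * self.d_head, (h + 1) * self.d_head
            q_h, k_h, v_h = q[..., s:e], k[..., s:e], v[..., s:e]
            # Scaled dot-product attention for this single head.
            scores = q_h @ k_h.transpose(-2, -1) / math.sqrt(self.d_head)
            weights = torch.softmax(scores, dim=-1)
            head_outputs.append(weights @ v_h)  # (batch, seq_len, d_head)
        # Concatenate the heads back together and apply the output projection.
        return self.W_o(torch.cat(head_outputs, dim=-1))

# Example usage with illustrative sizes:
attn = LoopMultiHeadAttention(d_model=64, num_heads=8)
out = attn(torch.randn(2, 10, 64))  # -> shape (2, 10, 64)
```

The loop version should produce the same result as the batched reshape version; it is just slower, because each head's matmuls are launched one at a time instead of being fused into a single batched operation.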