The official implementation for the ICLR 2023 spotlight paper "DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion"
376498485 opened this issue 9 months ago
If num_heads is set to a value other than 1, the code raises an error when computing attention at line 44 of spatial-temporal/difformer.py.