thuml / Flowformer

About: Code release for "Flowformer: Linearizing Transformers with Conservation Flows" (ICML 2022), https://arxiv.org/pdf/2202.06258.pdf

Flowformer_NLP/flow_attention.py raises an error when computing cross-attention with q and kv of different lengths

wanpengxyzz opened this issue · comments

The error occurs at the "(1) incoming and outgoing flow" step:

        sink_incoming = 1.0 / (torch.einsum("nld,nld->nl", q + 1e-6, k.cumsum(dim=1) + 1e-6))

[screenshot of the error]
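To see why this line requires equal lengths: the einsum pairs q and k.cumsum(dim=1) position by position through the shared "l" index. A minimal reproduction with mismatched lengths (shapes here are illustrative, not from the repo) fails immediately:

    import torch

    n, d = 2, 16                # batch size and feature dim (illustrative)
    q = torch.randn(n, 5, d)    # query length 5
    k = torch.randn(n, 7, d)    # key length 7, different from the query length

    # "nld,nld->nl" contracts both operands over the same length index "l",
    # so it only works when q and k have equal lengths:
    sink_incoming = 1.0 / (torch.einsum("nld,nld->nl", q + 1e-6, k.cumsum(dim=1) + 1e-6))
    # -> RuntimeError: the "l" dimensions of the two operands differ (5 vs 7)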

Hello, and thanks for your interest.
(1) The Flow-Attention provided in Flowformer_NLP is for the causal language modeling task, i.e., the case where Q, K, and V all have the same length.
(2) If you are using cross-attention, that corresponds to the non-causal version, which you can find here: https://github.com/thuml/Flowformer/blob/main/Flow_Attention.py
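For reference, here is a minimal sketch of the non-causal incoming/outgoing flow computation, following the paper's formulation rather than the linked file's exact code (the single-head layout, variable names, and sigmoid kernel placement are assumptions of this sketch). Because each flow sums over the other sequence's full length instead of pairing positions, q and k may have different lengths:

    import torch

    def noncausal_flows(q, k, eps=1e-6):
        # q: (n, L_q, d), k: (n, L_k, d); L_q and L_k may differ.
        q, k = torch.sigmoid(q), torch.sigmoid(k)  # non-negative kernel (assumed sigmoid)
        # Incoming flow at each sink (query) i: <q_i, sum_j k_j>, summed over all keys.
        sink_incoming = 1.0 / torch.einsum("nld,nd->nl", q + eps, k.sum(dim=1) + eps)
        # Outgoing flow at each source (key) j: <k_j, sum_i q_i>, summed over all queries.
        source_outgoing = 1.0 / torch.einsum("nsd,nd->ns", k + eps, q.sum(dim=1) + eps)
        return sink_incoming, source_outgoing

    q = torch.randn(2, 5, 16)   # query length 5
    k = torch.randn(2, 7, 16)   # key length 7: no error in the non-causal form
    inc, out = noncausal_flows(q, k)
    print(inc.shape, out.shape)  # torch.Size([2, 5]) torch.Size([2, 7])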

Yes.

OK, thanks for the explanation.