feifeibear / long-context-attention

Sequence Parallel Attention for Long Context LLM Model Training and Inference


Could you explain what advantages come from using these two schemes in combination? Is the technical rationale documented anywhere?

nullnonenilNULL opened this issue · comments

Please take a look at this article: https://zhuanlan.zhihu.com/p/689067888

Also refer to issue #40.
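For readers of this thread, here is a minimal sketch of the idea behind mixing the two schemes: the sequence-parallel ranks are arranged as a 2D grid, with one dimension doing Ulysses-style all-to-all over attention heads and the other doing Ring Attention over sequence blocks. This is an illustration only, not the repository's actual API; the function name `build_hybrid_groups` and the rank layout are assumptions, and it presumes `torch.distributed` has already been initialized.

```python
import torch.distributed as dist


def build_hybrid_groups(world_size: int, rank: int,
                        ring_degree: int, ulysses_degree: int):
    """Illustrative sketch: split `world_size` ranks into a
    (ring_degree x ulysses_degree) grid of process groups.

    Ulysses groups use contiguous ranks (typically within a node, where
    all-to-all bandwidth is high); ring groups use strided ranks that can
    span the slower inter-node links.
    """
    assert world_size == ring_degree * ulysses_degree

    ulysses_group = None
    ring_group = None

    # Ulysses groups: ranks sharing the same ring index (contiguous ranks).
    for i in range(ring_degree):
        ranks = list(range(i * ulysses_degree, (i + 1) * ulysses_degree))
        group = dist.new_group(ranks)  # every rank must call new_group
        if rank in ranks:
            ulysses_group = group

    # Ring groups: ranks sharing the same ulysses index (strided ranks).
    for j in range(ulysses_degree):
        ranks = list(range(j, world_size, ulysses_degree))
        group = dist.new_group(ranks)
        if rank in ranks:
            ring_group = group

    return ulysses_group, ring_group
```

In such a layout, each attention call would first all-to-all the heads inside its Ulysses group and then run ring-style P2P exchange of KV blocks across its ring group; the linked Zhihu article and issue #40 discuss the trade-offs in more detail.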