DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
c-dafan opened this issue 2 months ago · comments
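When c^{KV}_t is cached during inference, do k^C_t and v^C_t need to be recomputed from it? If so, then in multi-GPU inference each GPU needs the complete c^{KV}_t, which means multiple copies have to be stored, right?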
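For reference, the DeepSeek-V2 paper notes that at inference time W^{UK} can be absorbed into W^Q (and W^{UV} into W^O), so the keys and values do not have to be materialized from the cached latent. Below is a minimal NumPy sketch of the key side of that identity. All sizes and names (`W_DKV`, `W_UK`, `W_Q`, `scores_recompute`, `scores_absorbed`) are hypothetical toy choices, and the sketch leaves out the decoupled RoPE component, the softmax, and scaling. It is not the repository's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes, not the model's real dimensions.
d_model, d_c, n_heads, d_head, seq_len = 16, 4, 2, 8, 5

W_DKV = rng.standard_normal((d_c, d_model))            # down-projection to the latent
W_UK = rng.standard_normal((n_heads, d_head, d_c))     # per-head key up-projection
W_Q = rng.standard_normal((n_heads, d_head, d_model))  # per-head query projection

h = rng.standard_normal((seq_len, d_model))  # hidden states of already-seen tokens
h_q = rng.standard_normal(d_model)           # hidden state of the current token

# Only the compressed latent is cached, one d_c-vector per token.
c_kv = h @ W_DKV.T  # (seq_len, d_c)


def scores_recompute(c_kv, h_q):
    """Up-project the cached latents to per-head keys, then dot with the queries."""
    k = np.einsum("hdc,tc->htd", W_UK, c_kv)  # (n_heads, seq_len, d_head)
    q = np.einsum("hdm,m->hd", W_Q, h_q)      # (n_heads, d_head)
    return np.einsum("hd,htd->ht", q, k)


def scores_absorbed(c_kv, h_q):
    """Fold W_UK into the query side; the keys are never materialized."""
    q = np.einsum("hdm,m->hd", W_Q, h_q)         # (n_heads, d_head)
    q_latent = np.einsum("hd,hdc->hc", q, W_UK)  # (n_heads, d_c)
    return np.einsum("hc,tc->ht", q_latent, c_kv)


scores_a = scores_recompute(c_kv, h_q)
scores_b = scores_absorbed(c_kv, h_q)
print(np.allclose(scores_a, scores_b))
```

Both functions read the same `c_kv` for every head. So if the heads were split across GPUs (head-wise tensor parallelism, which is an assumption here and not something the issue states), each GPU would still need the full latent for its heads. Whether that means storing replicated copies depends on the parallelism scheme actually used.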