Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Home Page: https://arxiv.org/abs/2402.19427