fkodom / grouped-query-attention-pytorch

(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

fkodom/grouped-query-attention-pytorch Stargazers