thunlp / cost-optimal-gqa

The code for the paper "Cost-Optimal Grouped-Query Attention for Long-Context Modeling"

Home Page: https://arxiv.org/abs/2503.09579

Repository from GitHub: https://github.com/thunlp/cost-optimal-gqa
