mit-han-lab / spatten-llm

[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

Home Page: https://hanlab.mit.edu/projects/spatten

mit-han-lab/spatten-llm Stargazers