[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Home Page: https://hanlab.mit.edu/projects/spatten