tomaarsen / attention_sinks

Extend existing LLMs far beyond their original training length with constant memory usage, without retraining.

Home Page: https://huggingface.co/blog/tomaarsen/attention-sinks
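
A minimal usage sketch of the idea, assuming the package exposes drop-in replacements for the `transformers` Auto* classes and accepts attention-sink window parameters as described in the linked blog post; the class name, parameter names, and example model ID below are assumptions and should be checked against the repository's README.

```python
# Sketch only: assumes attention_sinks provides a drop-in AutoModelForCausalLM
# and the attention_sink_size / attention_sink_window_size keyword arguments.
from transformers import AutoTokenizer
from attention_sinks import AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # hypothetical example model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    attention_sink_size=4,            # assumed: initial "sink" tokens kept in the KV cache
    attention_sink_window_size=1020,  # assumed: sliding window of recent tokens kept in the KV cache
)

# Generate past the model's original training length; memory stays roughly
# constant because only the sink tokens plus the recent window are cached.
inputs = tokenizer("Long prompts or streaming generation go here.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```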
