A collection of memory-efficient attention operators implemented in the Triton language.
Repository from GitHub: https://github.com/NonvolatileMemory/FlagAttention