tlc-pack / libflash_attn

Standalone Flash Attention v2 kernel without libtorch dependency

The Flash Attention v2 kernel has been extracted from the original repo into this repo to make it easier to integrate into a third-party project. In particular, the dependency on libtorch has been removed.

As a consequence, dropout is not supported (since the original code relies on randomness provided by libtorch). Also, only the forward pass is supported for now.
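
If this repo is vendored into another CMake project, integration might look like the following sketch. The vendored path and the exported target name (flash_attn here) are assumptions, not taken from this README; check this repository's CMakeLists.txt for the actual names.

# Hypothetical integration sketch; the target name "flash_attn" and the
# third_party/ path are assumptions.
add_subdirectory(third_party/libflash_attn)
target_link_libraries(my_app PRIVATE flash_attn)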

Build with

mkdir build && cd build
cmake ..
make

It seems there are compilation issues if g++-9 is used as the host compiler. We confirmed that g++-11 works without issues.
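
If the system default is g++-9, the host compiler can be selected explicitly at configure time. The flags below are standard CMake variables; the exact compiler path may differ on your system.

cmake -DCMAKE_CXX_COMPILER=g++-11 -DCMAKE_CUDA_HOST_COMPILER=g++-11 ..
make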

License: BSD 3-Clause "New" or "Revised" License


Languages

C++ 85.7%, Cuda 13.7%, CMake 0.6%