implement a non-CUDA flash attention module
by321 opened this issue
by321 commented
The current flash attention module by Hazy Research is CUDA-only, which limits this repo to CUDA as well. I suggest writing a separate, non-CUDA attention fallback for machines without an NVIDIA GPU; the current module could still be used when an NVIDIA card is present.
A couple of people have raised this with Hazy Research, but they said they are focused on CUDA only and are not interested in writing a non-CUDA version.
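A minimal sketch of what such a fallback could look like (not this repo's actual code): it uses the flash-attn package's `flash_attn_func` when a CUDA device is available, and otherwise falls back to a plain-PyTorch scaled dot-product attention that runs on any device. The function name and layout follow recent flash-attn releases; adjust for the version actually pinned here.

```python
import math
import torch

try:
    from flash_attn import flash_attn_func  # CUDA-only fused kernel
    HAS_FLASH = True
except ImportError:
    HAS_FLASH = False

def attention(q, k, v, causal=False):
    """q, k, v: (batch, seqlen, heads, head_dim)."""
    if HAS_FLASH and q.is_cuda:
        # Fused flash attention path (NVIDIA GPU present).
        return flash_attn_func(q, k, v, causal=causal)
    # Fallback: materializes the full attention matrix (O(n^2) memory,
    # unlike flash attention), but needs no NVIDIA GPU.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))  # -> (b, h, n, d)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if causal:
        n = scores.size(-1)
        mask = torch.ones(n, n, dtype=torch.bool, device=q.device).triu(1)
        scores = scores.masked_fill(mask, float("-inf"))
    out = torch.softmax(scores, dim=-1) @ v
    return out.transpose(1, 2)  # back to (b, n, h, d)
```

Note that `flash_attn_func` requires fp16/bf16 tensors on CUDA, while the fallback path accepts any dtype, so a real integration would also need to handle dtype casting.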
Ayush Mangal commented
Added PR #37 to avoid using flash attention.