implement a non-CUDA flash attention module
by321 opened this issue
by321 commented
The current flash attention module by Hazy Research is CUDA-only, which limits this repo to CUDA as well. I suggest writing a separate, non-CUDA attention fallback for machines without an NVIDIA GPU; the current module could still be used when an NVIDIA card is present.
A couple of people have raised this with Hazy Research, but they said they are focused on CUDA only and are not interested in writing a non-CUDA version.
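A minimal sketch of what such a fallback could look like (not this repo's actual code): it uses the flash-attn package's `flash_attn_func` when a CUDA device is available, and otherwise falls back to a plain-PyTorch scaled dot-product attention that runs on any device. The function name and layout follow recent flash-attn releases; adjust for the version actually pinned here.

```python
import math
import torch

try:
    from flash_attn import flash_attn_func  # CUDA-only fused kernel
    HAS_FLASH = True
except ImportError:
    HAS_FLASH = False

def attention(q, k, v, causal=False):
    """q, k, v: (batch, seqlen, heads, head_dim)."""
    if HAS_FLASH and q.is_cuda:
        # Fused flash attention path (NVIDIA GPU present).
        return flash_attn_func(q, k, v, causal=causal)
    # Fallback: materializes the full attention matrix (O(n^2) memory,
    # unlike flash attention), but needs no NVIDIA GPU.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))  # -> (b, h, n, d)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if causal:
        n = scores.size(-1)
        mask = torch.ones(n, n, dtype=torch.bool, device=q.device).triu(1)
        scores = scores.masked_fill(mask, float("-inf"))
    out = torch.softmax(scores, dim=-1) @ v
    return out.transpose(1, 2)  # back to (b, n, h, d)
```

Note that `flash_attn_func` requires fp16/bf16 tensors on CUDA, while the fallback path accepts any dtype, so a real integration would also need to handle dtype casting.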
Ayush Mangal commented
Added PR #37 to avoid using flash attention.