Deep learning at the speed of light.
Home Page: https://luminalai.com
jafioti opened this issue 6 months ago
The end goal is to automatically discover flash attention, but a lot of work still needs to be done on an IR first. For now, let's hand-code a kernel and find-and-replace it.
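For anyone picking this up, here is a rough CPU-only sketch of the math a hand-coded kernel would need to reproduce. It is not Luminal code and not a GPU kernel: `naive_attention` and `tiled_attention` are hypothetical names, the layout (single head, row-major `f32` slices) is an assumption, and there is no masking or batching. The tiled version walks K/V in blocks and keeps a running max and running sum (the online softmax that flash attention relies on), so the full score row is never materialized.

```rust
/// Reference attention for one head: softmax(Q K^T / sqrt(d)) V.
/// q is [n, d], k and v are [m, d], all row-major. Returns [n, d].
/// (Hypothetical helper for illustration, not part of any library.)
fn naive_attention(q: &[f32], k: &[f32], v: &[f32], n: usize, m: usize, d: usize) -> Vec<f32> {
    let scale = 1.0 / (d as f32).sqrt();
    let mut out = vec![0.0f32; n * d];
    for i in 0..n {
        // Materialize the full row of scores; this is what the tiled version avoids.
        let scores: Vec<f32> = (0..m)
            .map(|j| (0..d).map(|c| q[i * d + c] * k[j * d + c]).sum::<f32>() * scale)
            .collect();
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
        let denom: f32 = exps.iter().sum();
        for j in 0..m {
            for c in 0..d {
                out[i * d + c] += exps[j] / denom * v[j * d + c];
            }
        }
    }
    out
}

/// Same result, but K/V are visited `block` rows at a time with an online softmax.
/// Assumes m > 0 and block > 0. (Hypothetical helper for illustration.)
fn tiled_attention(
    q: &[f32], k: &[f32], v: &[f32],
    n: usize, m: usize, d: usize, block: usize,
) -> Vec<f32> {
    assert!(block > 0, "block size must be positive");
    let scale = 1.0 / (d as f32).sqrt();
    let mut out = vec![0.0f32; n * d];
    for i in 0..n {
        let mut run_max = f32::NEG_INFINITY; // max score seen so far
        let mut run_sum = 0.0f32;            // softmax denominator relative to run_max
        let mut acc = vec![0.0f32; d];       // unnormalized output row
        for start in (0..m).step_by(block) {
            let end = (start + block).min(m);
            // Scores for this block only.
            let scores: Vec<f32> = (start..end)
                .map(|j| (0..d).map(|c| q[i * d + c] * k[j * d + c]).sum::<f32>() * scale)
                .collect();
            let block_max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
            let new_max = run_max.max(block_max);
            // Rescale what was accumulated under the old max.
            // On the first block this is exp(-inf) = 0, and the accumulators are zero anyway.
            let correction = (run_max - new_max).exp();
            run_sum *= correction;
            for a in acc.iter_mut() {
                *a *= correction;
            }
            for (idx, j) in (start..end).enumerate() {
                let p = (scores[idx] - new_max).exp();
                run_sum += p;
                for c in 0..d {
                    acc[c] += p * v[j * d + c];
                }
            }
            run_max = new_max;
        }
        // Normalize once at the end of the row.
        for c in 0..d {
            out[i * d + c] = acc[c] / run_sum;
        }
    }
    out
}
```

Comparing the two functions on random inputs (within a small floating-point tolerance) gives a cheap correctness check that a hand-written kernel could also be tested against before it is swapped in.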