[Flash Attention] Sliding window
xzyaoi opened this issue
Xiaozhe Yao commented
It looks like sliding-window attention is now supported in flash attention: https://github.com/Dao-AILab/flash-attention/blob/main/flash_attn/flash_attn_interface.py#L556
This is important for the Mistral model, and potentially useful for others as well. Let's see whether we can/should integrate it.
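For reference, here is a minimal NumPy sketch of what sliding-window attention computes, so we know what the kernel needs to produce. This is a naive reference implementation, not the flash-attention kernel; the `(left, right)` window convention (with `-1` meaning unlimited on that side) is modeled on the `window_size` parameter in `flash_attn_interface.py`, and the function names here are hypothetical:

```python
import numpy as np

def sliding_window_mask(seqlen, window_size):
    """Boolean mask: query i may attend to key j iff
    i - left <= j <= i + right. Follows flash-attn's
    (left, right) convention; -1 disables that bound."""
    left, right = window_size
    i = np.arange(seqlen)[:, None]
    j = np.arange(seqlen)[None, :]
    mask = np.ones((seqlen, seqlen), dtype=bool)
    if left >= 0:
        mask &= j >= i - left
    if right >= 0:
        mask &= j <= i + right
    return mask

def sliding_window_attention(q, k, v, window_size=(-1, 0)):
    """Naive (non-flash) single-head attention with a sliding window.
    q, k, v: (seqlen, head_dim) arrays. Default window is fully causal."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Disallowed positions get -inf so they receive zero weight.
    scores = np.where(sliding_window_mask(len(q), window_size), scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

For Mistral's causal sliding window of width W, each token attends only to itself and the previous W-1 tokens, i.e. something like `window_size=(W - 1, 0)` in this convention (worth double-checking off-by-one behavior against the actual kernel before integrating).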