mosaicml / llm-foundry

LLM training code for Databricks foundation models

Home Page: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm


Triton attention patch for Mistral

germanjke opened this issue

Hi!

I can use the Triton attention patch for LLaMA like this:

# Model
model:
  name: hf_causal_lm
  model_type: llama
  # patches the HF Llama attention module with the Triton implementation
  attention_patch_type: triton
  pretrained_model_name_or_path: /Llama-2-7B
  pretrained: true

Can I use it with Mistral, or is it not supported yet?

It's not supported; I recommend using Flash Attention 2 instead. Please see https://github.com/mosaicml/llm-foundry/tree/main/scripts/train#flashattention.
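
For reference, here is a minimal sketch of what a Mistral config using Flash Attention 2 could look like, based on the `use_flash_attention_2` option described in the FlashAttention section of the linked train README; the checkpoint path (`mistralai/Mistral-7B-v0.1`) is just an illustrative placeholder:

# Model
model:
  name: hf_causal_lm
  # placeholder checkpoint; substitute your local or HF Hub path
  pretrained_model_name_or_path: mistralai/Mistral-7B-v0.1
  pretrained: true
  # requires the flash-attn 2 package to be installed;
  # see scripts/train#flashattention for install instructions
  use_flash_attention_2: true

Note that, unlike the Triton patch, this goes through Hugging Face's native Flash Attention 2 integration, so no `attention_patch_type` or `model_type` field is needed.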