Triton attention patch from Mistral
germanjke opened this issue · comments
German Abramov commented
Hi!
I can use it with LLaMA like this:

```yaml
# Model
model:
  name: hf_causal_lm
  model_type: llama
  attention_patch_type: triton
  pretrained_model_name_or_path: /Llama-2-7B
  pretrained: true
```
Can I use it with Mistral, or is that not supported yet?
Daniel King commented
It's not supported. I recommend using Flash Attention 2 instead. Please see https://github.com/mosaicml/llm-foundry/tree/main/scripts/train#flashattention.
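For reference, a config along these lines should enable Flash Attention 2 for a Mistral checkpoint. This is a sketch, not a verified config: the `use_flash_attention_2` flag and the `/Mistral-7B-v0.1` path are assumptions based on the linked FlashAttention docs, and the exact option name may differ across llm-foundry versions, so check the README linked above for your version.

```yaml
# Model (sketch; flag name and checkpoint path are assumptions)
model:
  name: hf_causal_lm
  pretrained_model_name_or_path: /Mistral-7B-v0.1
  pretrained: true
  use_flash_attention_2: true  # requires the flash-attn package to be installed
```

Note that `attention_patch_type: triton` is dropped entirely here, since that patch only applies to LLaMA-style models.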