huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Home Page: https://huggingface.co/docs/optimum/main/


NotImplementedError: The model type esm is not yet supported to be used with BetterTransformer.

NancyFyong opened this issue

Feature request

I want to use flash attention to accelerate the ESM-2 model, but BetterTransformer does not support it yet. I would really appreciate it if the maintainers could add support for this model type. Thanks!
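
For reference, here is a minimal sketch of how the error shows up on my side; the checkpoint `facebook/esm2_t12_35M_UR50D` is just one example ESM-2 model from the Hub, any `esm`-type model behaves the same:

```python
# Minimal repro sketch: convert an ESM-2 model with BetterTransformer.
from transformers import AutoModelForMaskedLM
from optimum.bettertransformer import BetterTransformer

# "facebook/esm2_t12_35M_UR50D" is one example ESM-2 checkpoint (model type "esm").
model = AutoModelForMaskedLM.from_pretrained("facebook/esm2_t12_35M_UR50D")

# Raises:
# NotImplementedError: The model type esm is not yet supported to be used with BetterTransformer.
model = BetterTransformer.transform(model)
```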

Motivation

I've been training with the ESM-2 model recently, but training is very slow, and I would like to use flash attention to speed it up.

Your contribution

Sorry, I'm not able to contribute an implementation myself.