h2oai / h2o-llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/

Home Page: https://gpt-gm.h2o.ai


[CODE IMPROVEMENT] Flash_attn installation may be wrong if the wheel is cached

pascal-pfeiffer opened this issue

🔧 Proposed code refactoring

The flash_attn installation command in the Makefile is not fail-proof: if the wheel is cached and was built against a different CUDA version, the import fails at runtime:

RuntimeError: Failed to import transformers.models.mistral.modeling_mistral because of the following error (look up to see its traceback):
libcudart.so.12: cannot open shared object file: No such file or directory
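One possible mitigation, sketched below on the assumption that the Makefile installs flash-attn via pip, is to bypass pip's wheel cache so the wheel is always rebuilt against the CUDA toolkit in the current environment. The flags and target shown here are an illustration, not the repository's confirmed fix:

# Hedged sketch of a cache-safe install step (assumed command, not the actual Makefile target)

# Drop any previously cached flash-attn wheel so a build from an
# older CUDA toolchain cannot be silently reused.
pip cache remove flash_attn || true

# Reinstall, forcing a fresh build instead of a cached wheel.
#   --no-cache-dir        : do not read from or write to pip's wheel cache
#   --no-build-isolation  : build against the torch/CUDA already installed
#   --force-reinstall     : replace any existing flash-attn installation
pip install flash-attn --no-build-isolation --no-cache-dir --force-reinstall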

Motivation

Make the installation fail-proof.