magic-research / PLLaVA

Official repository for the paper PLLaVA

Error with Flash Attention 2 while testing the 34b demo

zhangchunjie1999 opened this issue

When I run the command to test the 34b model demo:

`bash scripts/demo.sh MODELS/pllava-34b MODELS/pllava-34b`

the following error occurs:
```
Traceback (most recent call last):
  File "/data/miniconda3/envs/pllava/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/miniconda3/envs/pllava/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/group/40010/esmezhang/PLLaVA/tasks/eval/demo/pllava_demo.py", line 246, in <module>
    chat = init_model(args)
  File "/group/40010/esmezhang/PLLaVA/tasks/eval/demo/pllava_demo.py", line 29, in init_model
    model, processor = load_pllava(
  File "/group/40010/esmezhang/PLLaVA/tasks/eval/model_utils.py", line 53, in load_pllava
    model = PllavaForConditionalGeneration.from_pretrained(repo_id, config=config, torch_dtype=torch.bfloat16)
  File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3552, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/group/40010/esmezhang/PLLaVA/models/pllava/modeling_pllava.py", line 295, in __init__
    self.language_model = AutoModelForCausalLM.from_config(config.text_config, torch_dtype=config.torch_dtype, attn_implementation="flash_attention_2")
  File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 437, in from_config
    return model_class._from_config(config, **kwargs)
  File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1385, in _from_config
    config = cls._autoset_attn_implementation(
  File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1454, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1557, in _check_and_enable_flash_attn_2
    raise ImportError(f"{preface} Flash Attention 2 is not available. {install_message}")
ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
```
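
Since the stack ends in an ImportError, a first step is to confirm whether the flash-attn package can be imported at all in this environment. This is only a minimal check, independent of the PLLaVA code:

```python
# Minimal check (not part of the repo) of whether the flash-attn package
# that transformers looks for is importable in the current environment.
import importlib.util

if importlib.util.find_spec("flash_attn") is None:
    print("flash-attn is not installed in this Python environment.")
else:
    import flash_attn
    print("flash-attn version:", flash_attn.__version__)
```

If the package is missing, the Hugging Face documentation page linked in the error message covers installation (typically `pip install flash-attn --no-build-isolation`, which needs a CUDA toolkit compatible with the installed PyTorch build, cu118 here).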

Running `transformers-cli env` in the terminal gives the following output:

```
- transformers version: 4.40.2
- Platform: Linux-4.14.105-1-tlinux3-0013-x86_64-with-glibc2.17
- Python version: 3.10.14
- Huggingface_hub version: 0.20.3
- Safetensors version: 0.4.2
- Accelerate version: 0.30.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.2.1+cu118 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
```

How can I fix it?
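
For reference, the traceback shows that `attn_implementation="flash_attention_2"` is hard-coded where the language model is created in `models/pllava/modeling_pllava.py`, so the error is raised even though the demo command never asks for flash attention explicitly. Below is a sketch of one possible local workaround, not a fix confirmed in this thread: guard that call with transformers' availability check and fall back to PyTorch's built-in SDPA attention.

```python
# Sketch of a possible local edit in models/pllava/modeling_pllava.py (the
# __init__ line shown in the traceback); `self` and `config` come from
# PllavaForConditionalGeneration.__init__. Assumption: falling back to
# PyTorch's SDPA attention is acceptable when flash-attn is not installed.
from transformers import AutoModelForCausalLM              # already used in this file
from transformers.utils import is_flash_attn_2_available   # add near the other imports

attn_impl = "flash_attention_2" if is_flash_attn_2_available() else "sdpa"
self.language_model = AutoModelForCausalLM.from_config(
    config.text_config,
    torch_dtype=config.torch_dtype,
    attn_implementation=attn_impl,
)
```

SDPA ships with PyTorch 2.x and needs no extra package; installing flash-attn as described in the linked documentation keeps the original code path.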

The hardware is two 40 GB A100 GPUs.

Hi, how did you fix this?

I've received your email. Best wishes!

Hi, have you solved the problem? I encountered the same error when running demo.py. If you have solved it, could you please share your solution? Thank you.