oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.


V1.15 Memory Access Fault - Radeon MI100 - ROCM 6.1.2

RSAStudioGames opened this issue · comments

Describe the bug

I am unable to load any model into VRAM; running on CPU only works without issue.
I can't even load a 1B-parameter model without getting an error.
I am unable to install flash-attention, and although I disabled it in the model menu, it still tries to load flash-attention for some reason.
I've reinstalled Text Gen Web UI about a dozen times with different versions of ROCm (5.6/5.7/6.1.2) and Text Gen Web UI (1.12/1.13/1.14/1.15).

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

I attempt to load the model as normal and it crashes with the same memory fault 100% of the time.

Screenshot

No response

Logs

19:43:11-546068 INFO     Loading "Llama-3.2-1B-Instruct-exl2"
19:43:12-304425 WARNING  Failed to load flash-attention due to the following error:

Traceback (most recent call last):
  File "/home/rsa/text-generation-webui/modules/exllamav2_hf.py", line 23, in <module>
    import flash_attn
ModuleNotFoundError: No module named 'flash_attn'
/home/rsa/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:611: UserWarning: `do_sample` is set to `False`. However, `min_p` is set to `0.0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `min_p`.
  warnings.warn(
Memory access fault by GPU node-1 (Agent handle: 0xc4f2d80) on address (nil). Reason: Page not present or supervisor privilege.
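For what it's worth, the flash-attention warning above is expected to be non-fatal: loaders like `modules/exllamav2_hf.py` typically guard the import and fall back when the wheel isn't installed (as it can't be on ROCm here). A minimal sketch of that pattern, assuming a fallback attention implementation name of my own choosing:

```python
import importlib.util

def has_flash_attn() -> bool:
    """Return True only if the flash_attn package is importable."""
    return importlib.util.find_spec("flash_attn") is not None

# Pick an attention backend; "sdpa" as the fallback is my assumption,
# not necessarily what text-generation-webui itself selects.
attn_impl = "flash_attention_2" if has_flash_attn() else "sdpa"
print(f"attention backend: {attn_impl}")
```

So the `ModuleNotFoundError` in the log is just the guarded import failing and being reported; the actual crash is the GPU memory access fault below it.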

System Info

Ubuntu 22.04
3x Radeon Instinct MI100
AMD Epyc 9334
ROCm 6.1.2
Text Gen Web UI V1.15

I reinstalled EVERYTHING from scratch and that seemed to fix it. 🤷