oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.


Errors after installing locally built loaders

nktice opened this issue · comments

Describe the bug

With a fresh install of 1.15, Exllamav2_HF loads a model just fine. However, after I do a local install of exllamav2, both the Exllamav2 and Exllamav2_HF loaders break (errors below).

To isolate the issue, I tested llamacpp_HF, both from a fresh install and after the local install of exllamav2; the result is the same: llamacpp_HF doesn't work and gives the error below.
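A quick way to see whether the compiled extension itself is the problem (a diagnostic sketch, assuming the extension is packaged as the top-level module exllamav2_ext, which the tracebacks below suggest):

```python
# Run inside the textgen conda env: if the compiled extension doesn't
# import cleanly, every loader that imports exllamav2 fails with it.
try:
    import exllamav2_ext
    print("exllamav2_ext OK:", exllamav2_ext.__file__)
except ImportError as e:
    print("exllamav2_ext missing or broken:", e)
```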

Attempting to use the llamacpp_HF loader gives this error:

07:44:50-634907 ERROR    Could not load the model because a tokenizer in        
                         Transformers format was not found.                     
07:44:50-635599 INFO     Loaded                                                 
                         "cognitivecomputations_dolphin-2.9.4-llama3.1-8b-gguf" 
                         in 0.17 seconds.                                       
07:44:50-636142 INFO     LOADER: "llamacpp_HF"                                  
07:44:50-636566 INFO     TRUNCATION LENGTH: 131072                              
07:44:50-636985 INFO     INSTRUCTION TEMPLATE: "Custom (obtained from model     
                         metadata)"     
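(For context: llamacpp_HF wraps the GGUF in a Transformers-compatible interface, so it expects Transformers-format tokenizer files in the model folder next to the GGUF. A minimal sketch of fetching them, where the base repo id is an assumption, not something from this report:)

```python
# Hypothetical fix sketch: copy the base model's tokenizer files into the
# GGUF model's folder so llamacpp_HF can find a Transformers tokenizer.
from huggingface_hub import hf_hub_download

for fname in ("tokenizer.json", "tokenizer_config.json", "special_tokens_map.json"):
    hf_hub_download(
        repo_id="meta-llama/Llama-3.1-8B-Instruct",  # assumption: the base repo
        filename=fname,
        local_dir="models/cognitivecomputations_dolphin-2.9.4-llama3.1-8b-gguf",
    )
```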

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Here are the instructions I use to get a fresh exllamav2 from source:

# cd ~/text-generation-webui/
git clone https://github.com/turboderp/exllamav2 repositories/exllamav2
cd repositories/exllamav2
pip install -r requirements.txt  --extra-index-url https://download.pytorch.org/whl/nightly/rocm6.2
pip install .   --extra-index-url https://download.pytorch.org/whl/nightly/rocm6.2
cd ../..
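Before building, it's worth confirming the env really has the nightly ROCm torch, since the extension compiles against whatever torch is installed (a small sanity check; torch.version.hip is set only on ROCm builds):

```python
# Sanity-check the toolchain the source build will compile against.
import torch

print("torch:", torch.__version__)
print("HIP runtime:", getattr(torch.version, "hip", None))  # None on CUDA builds
print("visible devices:", torch.cuda.device_count())
```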

And after that, the errors I get are shown in the Logs section below.

Commands to install llama-cpp-python:

# remove old versions 
pip uninstall llama_cpp_python -y 
pip uninstall llama_cpp_python_cuda -y 
## install llama-cpp-python
git clone  --recurse-submodules  https://github.com/abetlen/llama-cpp-python.git repositories/llama-cpp-python 
cd repositories/llama-cpp-python
CC='/opt/rocm/llvm/bin/clang' CXX='/opt/rocm/llvm/bin/clang++' CFLAGS='-fPIC' CXXFLAGS='-fPIC' CMAKE_PREFIX_PATH='/opt/rocm' ROCM_PATH="/opt/rocm" HIP_PATH="/opt/rocm" CMAKE_ARGS="-GNinja -DLLAMA_HIPBLAS=ON -DLLAMA_AVX2=on " pip install --no-cache-dir .
cd ../.. 
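After the build, a quick import check shows whether the wheel actually came out with GPU offload compiled in (a sketch; llama_supports_gpu_offload is exposed by recent llama-cpp-python releases):

```python
# Confirm the freshly built wheel imports and reports GPU offload support.
import llama_cpp

print("llama-cpp-python:", llama_cpp.__version__)
print("GPU offload:", llama_cpp.llama_supports_gpu_offload())
```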

These same commands worked on previous versions of TGW to get exllamav2 and llama-cpp-python, respectively, up and running.

Screenshot

No response

Logs

Exllamav2 

07:35:33-392473 ERROR    Failed to load the model.                              
Traceback (most recent call last):
  File "/home/n/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/models.py", line 93, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/models.py", line 306, in ExLlamav2_loader
    from modules.exllamav2 import Exllamav2Model
  File "/home/n/text-generation-webui/modules/exllamav2.py", line 5, in <module>
    from exllamav2 import (
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/model.py", line 35, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/config.py", line 5, in <module>
    from exllamav2.stloader import STFile, cleanup_stfiles
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/stloader.py", line 5, in <module>
    from exllamav2.ext import none_tensor, exllamav2_ext as ext_c
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/ext.py", line 291, in <module>
    ext_c = exllamav2_ext
            ^^^^^^^^^^^^^
NameError: name 'exllamav2_ext' is not defined

And it's the same thing with the built-in Exllamav2_HF loader:

07:35:44-710182 ERROR    Failed to load the model.                              
Traceback (most recent call last):
  File "/home/n/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/models.py", line 93, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/models.py", line 313, in ExLlamav2_HF_loader
    from modules.exllamav2_hf import Exllamav2HF
  File "/home/n/text-generation-webui/modules/exllamav2_hf.py", line 7, in <module>
    from exllamav2 import (
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/model.py", line 35, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/config.py", line 5, in <module>
    from exllamav2.stloader import STFile, cleanup_stfiles
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/stloader.py", line 5, in <module>
    from exllamav2.ext import none_tensor, exllamav2_ext as ext_c
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/ext.py", line 291, in <module>
    ext_c = exllamav2_ext
            ^^^^^^^^^^^^^
NameError: name 'exllamav2_ext' is not defined
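The NameError (rather than an ImportError) suggests that nothing ever bound the name exllamav2_ext before line 291 uses it, i.e. both the prebuilt-extension import and any fallback build path were skipped or failed silently. Reduced to its essence (a sketch of the failure mode, not the actual exllamav2 source):

```python
# If the only code path that binds exllamav2_ext fails or is skipped,
# the later assignment raises NameError instead of ImportError.
try:
    import exllamav2_ext       # the compiled extension from the wheel
except ImportError:
    pass                       # suppose the fallback path never ran

ext_c = exllamav2_ext          # NameError: name 'exllamav2_ext' is not defined
```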

After 'installing' llama-cpp-python 'successfully', trying to use it to load a model gives the following output:

07:58:08-987760 ERROR    Failed to load the model.                              
Traceback (most recent call last):
  File "/home/n/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/models.py", line 93, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/models.py", line 278, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/llamacpp_model.py", line 39, in from_pretrained
    LlamaCache = llama_cpp_lib().LlamaCache
                 ^^^^^^^^^^^^^^^
  File "/home/n/text-generation-webui/modules/llama_cpp_python_hijack.py", line 39, in llama_cpp_lib
    raise Exception(f"Cannot import `{lib_name}` because `{imported_module}` is already imported. Switching to a different version of llama-cpp-python currently requires a server restart.")
Exception: Cannot import `llama_cpp_cuda` because `llama_cpp` is already imported. Switching to a different version of llama-cpp-python currently requires a server restart.
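That exception comes from the webui's shim, which selects between the bundled llama_cpp variants and refuses to load a second one once the first is cached in sys.modules. Roughly (a simplified sketch, not the actual modules/llama_cpp_python_hijack.py):

```python
import importlib
import sys

def llama_cpp_lib(lib_name: str):
    # Refuse to mix variants: Python's module cache makes swapping unsafe.
    for other in ("llama_cpp", "llama_cpp_cuda"):
        if other != lib_name and other in sys.modules:
            raise Exception(
                f"Cannot import `{lib_name}` because `{other}` is already "
                "imported. Switching to a different version of "
                "llama-cpp-python currently requires a server restart."
            )
    return importlib.import_module(lib_name)
```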


### System Info

```shell
AMD Ryzen 9 5950X CPU
2x Radeon 7900 XTX
Ubuntu 24.04.1
```

I've composed this guide for my install instructions: http://github.com/nktice/AMD-AI

Same error for me, using a Mac M1.

I'm getting the same NameError: name 'exllamav2_ext' is not defined issue as well, with ROCm 6.1.2 / Radeon MI100.

I have the same problem with my MacBook Pro M1 Max:

Traceback (most recent call last):
  File "/Users/lixian/Documents/GitProject/text-generation-webui/installer_files/env/lib/python3.11/importlib/metadata/__init__.py", line 563, in from_name
    return next(cls.discover(name=name))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/lixian/Documents/GitProject/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper

Same error with an RTX 3090 on Windows 10.

A clean install did the trick!