Errors after installing locally built loaders
nktice opened this issue
Describe the bug
With a fresh install of 1.15, Exllamav2_HF loads a model just fine... However, when I do a local install of exllamav2, both it and the Exllamav2_HF loader break (errors below).
To isolate the issue, I tested llamacpp_HF - from a fresh install, and after the local install of exllamav2 the result is the same: llamacpp_HF doesn't work and gives the error below.
Attempting to use the llamacpp_HF loader gives this error:
07:44:50-634907 ERROR Could not load the model because a tokenizer in
Transformers format was not found.
07:44:50-635599 INFO Loaded
"cognitivecomputations_dolphin-2.9.4-llama3.1-8b-gguf"
in 0.17 seconds.
07:44:50-636142 INFO LOADER: "llamacpp_HF"
07:44:50-636566 INFO TRUNCATION LENGTH: 131072
07:44:50-636985 INFO INSTRUCTION TEMPLATE: "Custom (obtained from model
metadata)"
Is there an existing issue for this?
- I have searched the existing issues
Reproduction
Here are the instructions I use to get a fresh exllamav2 from source...
# cd ~/text-generation-webui/
git clone https://github.com/turboderp/exllamav2 repositories/exllamav2
cd repositories/exllamav2
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/nightly/rocm6.2
pip install . --extra-index-url https://download.pytorch.org/whl/nightly/rocm6.2
cd ../..
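To confirm the local build before starting the webui, an import check along these lines can help (a sketch, assuming the textgen conda env is active; the import matches the chain in the tracebacks below):

```shell
# Run inside the textgen env; exercises the same import the webui uses.
python -c "from exllamav2 import ExLlamaV2; print('exllamav2 import OK')"
```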
And after that, the errors I get are shown under Logs below...
Here are the commands I use to install llama-cpp-python:
# remove old versions
pip uninstall llama_cpp_python -y
pip uninstall llama_cpp_python_cuda -y
## install llama-cpp-python
git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git repositories/llama-cpp-python
cd repositories/llama-cpp-python
CC='/opt/rocm/llvm/bin/clang' CXX='/opt/rocm/llvm/bin/clang++' CFLAGS='-fPIC' CXXFLAGS='-fPIC' CMAKE_PREFIX_PATH='/opt/rocm' ROCM_PATH="/opt/rocm" HIP_PATH="/opt/rocm" CMAKE_ARGS="-GNinja -DLLAMA_HIPBLAS=ON -DLLAMA_AVX2=on " pip install --no-cache-dir .
cd ../..
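A similar import check works for the llama-cpp-python build (a sketch; the __version__ attribute is assumed to be exposed by the llama_cpp package):

```shell
# Run inside the textgen env after the build finishes.
python -c "import llama_cpp; print(llama_cpp.__version__)"
```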
These same commands worked on previous versions to get exllamav2 and llama-cpp-python up and running with TGW.
Screenshot
No response
Logs
Exllamav2
07:35:33-392473 ERROR Failed to load the model.
Traceback (most recent call last):
File "/home/n/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/models.py", line 93, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/models.py", line 306, in ExLlamav2_loader
from modules.exllamav2 import Exllamav2Model
File "/home/n/text-generation-webui/modules/exllamav2.py", line 5, in <module>
from exllamav2 import (
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/__init__.py", line 3, in <module>
from exllamav2.model import ExLlamaV2
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/model.py", line 35, in <module>
from exllamav2.config import ExLlamaV2Config
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/config.py", line 5, in <module>
from exllamav2.stloader import STFile, cleanup_stfiles
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/stloader.py", line 5, in <module>
from exllamav2.ext import none_tensor, exllamav2_ext as ext_c
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/ext.py", line 291, in <module>
ext_c = exllamav2_ext
^^^^^^^^^^^^^
NameError: name 'exllamav2_ext' is not defined
And it's the same thing with the built-in exllamav2_HF loader:
07:35:44-710182 ERROR Failed to load the model.
Traceback (most recent call last):
File "/home/n/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/models.py", line 93, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/models.py", line 313, in ExLlamav2_HF_loader
from modules.exllamav2_hf import Exllamav2HF
File "/home/n/text-generation-webui/modules/exllamav2_hf.py", line 7, in <module>
from exllamav2 import (
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/__init__.py", line 3, in <module>
from exllamav2.model import ExLlamaV2
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/model.py", line 35, in <module>
from exllamav2.config import ExLlamaV2Config
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/config.py", line 5, in <module>
from exllamav2.stloader import STFile, cleanup_stfiles
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/stloader.py", line 5, in <module>
from exllamav2.ext import none_tensor, exllamav2_ext as ext_c
File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/ext.py", line 291, in <module>
ext_c = exllamav2_ext
^^^^^^^^^^^^^
NameError: name 'exllamav2_ext' is not defined
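The NameError suggests the compiled extension was never importable, so checking the extension module directly can narrow it down (a sketch; the module name exllamav2_ext is taken from the traceback, and whether it imports cleanly depends on the local build):

```shell
# Check the compiled extension directly; the module name comes from the
# NameError above. A failure here points at the build, not the webui.
python -c "import exllamav2_ext; print('exllamav2_ext OK')"
```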
After 'successfully' installing llama-cpp-python, trying to use it to load a model gives the following output:
07:58:08-987760 ERROR Failed to load the model.
Traceback (most recent call last):
File "/home/n/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/models.py", line 93, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/models.py", line 278, in llamacpp_loader
model, tokenizer = LlamaCppModel.from_pretrained(model_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/llamacpp_model.py", line 39, in from_pretrained
LlamaCache = llama_cpp_lib().LlamaCache
^^^^^^^^^^^^^^^
File "/home/n/text-generation-webui/modules/llama_cpp_python_hijack.py", line 39, in llama_cpp_lib
raise Exception(f"Cannot import `{lib_name}` because `{imported_module}` is already imported. Switching to a different version of llama-cpp-python currently requires a server restart.")
Exception: Cannot import `llama_cpp_cuda` because `llama_cpp` is already imported. Switching to a different version of llama-cpp-python currently requires a server restart.
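The exception is raised by the webui's llama_cpp_python_hijack module once one llama-cpp-python variant has already been imported. Listing which variants are installed can show whether both a CPU and a GPU build are present (a sketch):

```shell
# Show which llama-cpp-python variants are installed; having more than one
# (e.g. llama_cpp_python and llama_cpp_python_cuda) can trigger this conflict.
pip list | grep -i llama
```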
System Info

```shell
AMD Ryzen 9 5950x CPU
2x Radeon 7900 XTX
Ubuntu 24.04.1
```

I've composed this guide for my install instructions: http://github.com/nktice/AMD-AI
I'm getting the same issue, NameError: name 'exllamav2_ext' is not defined, with ROCm 6.1.2 / a Radeon MI100.
I have the same problem with my MacBook Pro M1 Max:
Traceback (most recent call last):
File "/Users/lixian/Documents/GitProject/text-generation-webui/installer_files/env/lib/python3.11/importlib/metadata/init.py", line 563, in from_name
return next(cls.discover(name=name))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/lixian/Documents/GitProject/text-generation-webui/modules/ui_model_menu.py", line 232, in load_model_wrapper
Same error with RTX 3090, W10
A clean install did the trick!