build error
deeeed opened this issue
Arthur commented
Hi, I tried to build on a clean env and I'm not sure what I am missing.
Running on NVIDIA GeForce RTX 4080
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python server.py
Traceback (most recent call last):
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/ext.py", line 14, in <module>
import exllamav2_ext
ModuleNotFoundError: No module named 'exllamav2_ext'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/deeeed/dev/exui/server.py", line 11, in <module>
from backend.models import update_model, load_models, get_model_info, list_models, remove_model, load_model, unload_model, get_loaded_model
File "/home/deeeed/dev/exui/backend/models.py", line 5, in <module>
from exllamav2 import(
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/__init__.py", line 3, in <module>
from exllamav2.model import ExLlamaV2
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/model.py", line 17, in <module>
from exllamav2.cache import ExLlamaV2CacheBase
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/cache.py", line 2, in <module>
from exllamav2.ext import exllamav2_ext as ext_c
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/ext.py", line 126, in <module>
exllamav2_ext = load \
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1308, in load
return _jit_compile(
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'exllamav2_ext': [1/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/h_gemm.cu -o h_gemm.cuda.o
FAILED: h_gemm.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/h_gemm.cu -o h_gemm.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[2/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/lora.cu -o lora.cuda.o
FAILED: lora.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/lora.cu -o lora.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[3/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/pack_tensor.cu -o pack_tensor.cuda.o
FAILED: pack_tensor.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/pack_tensor.cu -o pack_tensor.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[4/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/quantize.cu -o quantize.cuda.o
FAILED: quantize.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/quantize.cu -o quantize.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[5/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_matrix.cu -o q_matrix.cuda.o
FAILED: q_matrix.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_matrix.cu -o q_matrix.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[6/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_attn.cu -o q_attn.cuda.o
FAILED: q_attn.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_attn.cu -o q_attn.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[7/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_mlp.cu -o q_mlp.cuda.o
FAILED: q_mlp.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_mlp.cu -o q_mlp.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[8/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_gemm.cu -o q_gemm.cuda.o
FAILED: q_gemm.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/q_gemm.cu -o q_gemm.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[9/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/rope.cu -o rope.cuda.o
FAILED: rope.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/rope.cu -o rope.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[10/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/rms_norm.cu -o rms_norm.cuda.o
FAILED: rms_norm.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/rms_norm.cu -o rms_norm.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[11/15] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/cache.cu -o cache.cuda.o
FAILED: cache.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 --compiler-options '-fPIC' -lineinfo -O3 -std=c++17 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cuda/cache.cu -o cache.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_89'
[12/15] c++ -MMD -MF sampling.o.d -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cpp/sampling.cpp -o sampling.o
[13/15] c++ -MMD -MF quantize_func.o.d -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/cpp/quantize_func.cpp -o quantize_func.o
[14/15] c++ -MMD -MF ext.o.d -DTORCH_EXTENSION_NAME=exllamav2_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/TH -isystem /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/deeeed/dev/exui/venv/lib/python3.10/site-packages/exllamav2/exllamav2_ext/ext.cpp -o ext.o
ninja: build stopped: subcommand failed.
(venv) ➜ exui git:(master) ✗
turboderp commented
It looks like something is misconfigured. compute_89 is the correct version for the 4080, so it looks like Torch has picked up on that, but maybe your CUDA version is too old?
You can try one of the prebuilt wheels here.
Also you can install PyTorch from here to make sure you're getting the right version for your setup.
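A quick way to test the "CUDA version is too old" theory is to look at which release the nvcc on your PATH reports. The sketch below is only an illustration, not part of exllamav2: the helper names are made up, and it assumes that CUDA 11.8 is the first toolkit release whose nvcc accepts compute_89, and that `nvcc --version` prints a line containing "release X.Y".

```python
import re
import shutil
import subprocess

# Assumption: CUDA 11.8 is the first toolkit whose nvcc accepts compute_89 / sm_89.
MIN_CUDA_FOR_SM89 = (11, 8)


def parse_nvcc_release(version_output):
    """Extract (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def nvcc_supports_sm89(version_output):
    """True if the reported toolkit release is new enough for compute_89."""
    release = parse_nvcc_release(version_output)
    return release is not None and release >= MIN_CUDA_FOR_SM89


if __name__ == "__main__":
    # Check the same nvcc the build would pick up from PATH.
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH")
    else:
        out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
        print(nvcc, parse_nvcc_release(out), "supports compute_89:", nvcc_supports_sm89(out))
```

If this reports a release below 11.8, the system nvcc (here /usr/bin/nvcc) is the likely culprit, and either a newer CUDA toolkit or a prebuilt wheel avoids the local compile.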
Arthur commented
Thanks, it seems to work with the prebuilt wheels!