Atinoda / text-generation-webui-docker

Docker variants of oobabooga's text-generation-webui, including pre-built images.

Not running on Apple M1: third_party/cuda/bin/ptxas --version died with <Signals.SIGTRAP: 5>.

AIWintermuteAI opened this issue

It looks like the default variant (and the cpu one too) does not run on M1. I tried specifying `- EXTRA_LAUNCH_ARGS="--listen --verbose --loader llama.cpp"`, but that had no effect (a sketch of my compose file is below).
It would be nice to have this fixed, since llama.cpp actually runs quite well with smaller models on M1.
Any pointers?
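
For reference, a minimal sketch of the compose service I'm running (the image tag, port mapping, and `platform` line are assumptions for illustration, not necessarily the project's exact file; the `EXTRA_LAUNCH_ARGS` value is exactly what I set):

```yaml
# Illustrative docker-compose service -- image tag, port, and platform line
# are assumptions for this sketch, not the project's exact defaults.
services:
  text-generation-webui:
    image: atinoda/text-generation-webui:default  # pre-built image is linux/amd64 only
    # platform: linux/amd64   # silences the mismatch warning, still emulated
    environment:
      - EXTRA_LAUNCH_ARGS="--listen --verbose --loader llama.cpp"
    ports:
      - "7860:7860"
```

Uncommenting `platform: linux/amd64` makes the platform warning go away, but the container still runs under emulation, so I'd expect the same crash below.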

```
[+] Building 0.0s (0/0)
[+] Running 2/0
 ✔ Container text-generation-webui  Recreated  0.1s
 ! text-generation-webui-docker The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested  0.0s
Attaching to text-generation-webui
text-generation-webui  | === Running text-generation-webui variant: 'DEFAULT' snapshot-2023-10-15 ===
text-generation-webui  | === (This version is 11 commits behind origin main) ===
text-generation-webui  | === Image build date: 2023-10-18 11:30:52 ===
text-generation-webui  | 2023-10-23 18:27:36 WARNING:
text-generation-webui  | You are potentially exposing the web UI to the entire internet without any access password.
text-generation-webui  | You can create one with the "--gradio-auth" flag like this:
text-generation-webui  |
text-generation-webui  | --gradio-auth username:password
text-generation-webui  |
text-generation-webui  | Make sure to replace username:password with your own.
text-generation-webui  | /venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
text-generation-webui  |   warn("The installed version of bitsandbytes was compiled without GPU support. "
text-generation-webui  | /venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
text-generation-webui  | Traceback (most recent call last):
text-generation-webui  |   File "/app/server.py", line 31, in <module>
text-generation-webui  |     from modules import (
text-generation-webui  |   File "/app/modules/training.py", line 21, in <module>
text-generation-webui  |     from peft import (
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/peft/__init__.py", line 22, in <module>
text-generation-webui  |     from .auto import (
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/peft/auto.py", line 31, in <module>
text-generation-webui  |     from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/peft/mapping.py", line 23, in <module>
text-generation-webui  |     from .peft_model import (
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/peft/peft_model.py", line 38, in <module>
text-generation-webui  |     from .tuners import (
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/peft/tuners/__init__.py", line 21, in <module>
text-generation-webui  |     from .lora import LoraConfig, LoraModel
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/peft/tuners/lora.py", line 45, in <module>
text-generation-webui  |     import bitsandbytes as bnb
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 16, in <module>
text-generation-webui  |     from .nn import modules
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/bitsandbytes/nn/__init__.py", line 6, in <module>
text-generation-webui  |     from .triton_based_modules import SwitchBackLinear, SwitchBackLinearGlobal, SwitchBackLinearVectorwise, StandardLinear
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/bitsandbytes/nn/triton_based_modules.py", line 8, in <module>
text-generation-webui  |     from bitsandbytes.triton.dequantize_rowwise import dequantize_rowwise
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/bitsandbytes/triton/dequantize_rowwise.py", line 10, in <module>
text-generation-webui  |     import triton
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/__init__.py", line 20, in <module>
text-generation-webui  |     from .compiler import compile, CompilationError
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/compiler/__init__.py", line 1, in <module>
text-generation-webui  |     from .compiler import CompiledKernel, compile, instance_descriptor
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 27, in <module>
text-generation-webui  |     from .code_generator import ast_to_ttir
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/compiler/code_generator.py", line 8, in <module>
text-generation-webui  |     from .. import language
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/language/__init__.py", line 4, in <module>
text-generation-webui  |     from . import math
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/language/math.py", line 4, in <module>
text-generation-webui  |     from . import core
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/language/core.py", line 1376, in <module>
text-generation-webui  |     def minimum(x, y):
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 542, in jit
text-generation-webui  |     return decorator(fn)
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 534, in decorator
text-generation-webui  |     return JITFunction(
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 433, in __init__
text-generation-webui  |     self.run = self._make_launcher()
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 388, in _make_launcher
text-generation-webui  |     scope = {"version_key": version_key(),
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 120, in version_key
text-generation-webui  |     ptxas = path_to_ptxas()[0]
text-generation-webui  |   File "/venv/lib/python3.10/site-packages/triton/common/backend.py", line 114, in path_to_ptxas
text-generation-webui  |     result = subprocess.check_output([ptxas_bin, "--version"], stderr=subprocess.STDOUT)
text-generation-webui  |   File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
text-generation-webui  |     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
text-generation-webui  |   File "/usr/lib/python3.10/subprocess.py", line 526, in run
text-generation-webui  |     raise CalledProcessError(retcode, process.args,
text-generation-webui  | subprocess.CalledProcessError: Command '['/venv/lib/python3.10/site-packages/triton/common/../third_party/cuda/bin/ptxas', '--version']' died with <Signals.SIGTRAP: 5>.
text-generation-webui exited with code 1
```
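
For anyone digging into this: the crash has nothing to do with llama.cpp itself. server.py imports modules/training.py, which pulls in peft → bitsandbytes → triton, and triton runs its bundled x86-64 ptxas binary at import time to compute a version key. On an arm64 host that binary runs under emulation and dies with SIGTRAP, before any loader is even selected, so `--loader llama.cpp` can't help. A minimal sketch of the failing call, reduced from the traceback above (the path is copied verbatim from the error message):

```python
# Reduced repro of the import-time failure: triton's version_key() shells out
# to its bundled x86-64 ptxas, which crashes under emulation on an arm64 host.
import subprocess

ptxas = "/venv/lib/python3.10/site-packages/triton/common/../third_party/cuda/bin/ptxas"
subprocess.check_output([ptxas, "--version"], stderr=subprocess.STDOUT)
# -> subprocess.CalledProcessError: ... died with <Signals.SIGTRAP: 5>
```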

I can't properly support macOS because I don't have any Apple Silicon hardware at the moment. I absolutely intend to support it in the future though, because the unified memory architecture is a monster for LLM inference work!

Can you please check #22 and leave a comment there if you get things working? I'm going to close this issue and direct the Mac discussion to that thread; hopefully it'll serve as a central reference for you and others for now.