jllllll / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

Can you provide the whl of CUDAv11.8?

jimi202008 opened this issue · comments

I tried to build a wheel for CUDA 11.8, but the build always fails with an error. Can you provide a prebuilt wheel for CUDA 11.8?
Building wheels for collected packages: exllama
Building wheel for exllama (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [1544 lines of output]
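A common cause of this kind of `bdist_wheel` failure is a mismatch between the CUDA version PyTorch was compiled against and the CUDA toolkit installed locally: the extension build picks up the local `nvcc`, and if it disagrees with `torch.version.cuda`, compilation errors out. Before rebuilding, it is worth comparing the two versions. A minimal diagnostic sketch (the `cuda_matches` helper and the `TARGET_CUDA` constant are hypothetical, not part of exllama):

```python
# Diagnostic sketch: check whether a PyTorch build's reported CUDA
# version matches the toolkit you intend to build against.
TARGET_CUDA = "11.8"  # assumption: the version requested in this issue

def cuda_matches(torch_cuda_version, target=TARGET_CUDA):
    """Return True if torch's compiled CUDA version matches the target.

    torch_cuda_version is the value of torch.version.cuda, which is
    None for CPU-only PyTorch builds.
    """
    if torch_cuda_version is None:  # CPU-only torch cannot build CUDA extensions
        return False
    return torch_cuda_version.startswith(target)

# In a real environment you would pass torch.version.cuda here;
# the literals below just illustrate the check.
print(cuda_matches("11.8"))  # a CUDA 11.8 torch build: True
print(cuda_matches("12.1"))  # a CUDA 12.1 torch build: False
```

If the versions disagree, installing a PyTorch build that matches the local CUDA 11.8 toolkit (or vice versa) before running `python setup.py bdist_wheel` typically resolves errors like the one above.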