Windows CUDA Make chat problem
M0rtale opened this issue · comments
Алан commented
I am trying to use this solution on Windows with CUDA (compute capability 8.6). I am running into an unresolved reference to the function LLaVAGenerate at link time, as shown in the screenshot below.
Steps to replicate:
Environment:
- cl.exe from Visual Studio 2022, MSVC toolset 14.29.30133
- CUDA Toolkit 12
- LLaMA2 13B AWQ int4 model, downloaded using the command:
python tools/download_model.py --model LLaMA2_13B_chat_awq_int4 --QM QM_CUDA
- pthread package from vcpkg, with the include and lib files linked directly into the project
- PATH:
/c/CUDA/v12/libnvvp:/c/CUDA/v12/bin:/c/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64:/ucrt64/bin:/usr/local/bin:/usr/bin:/bin:/c/Windows/System32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
I fixed a few issues with NUM_THREAD and tanhf not being defined, then built using the command make chat -j.
My guess is that the only implementation of LLaVAGenerate is in the non_cuda directory, so perhaps it is being omitted from the CUDA compilation? Note that compiling with the CPU flag works fine and I can get output from the LLM.