CNugteren / CLBlast

Tuned OpenCL BLAS

llama.cpp hangs when defining cl_uint variable

JohannesGaessler opened this issue

I am using llama.cpp, which has implemented CLBlast acceleration. However, when I tried it out, the program unexpectedly froze during initialization at the point where a cl_uint variable is defined. The relevant line of code is https://github.com/ggerganov/llama.cpp/blob/master/ggml-opencl.c#L211 . I am assuming that this is a CLBlast bug because the code prior to the call is very simple. I also tried running the clblast_tuner_xgemm binary produced by CLBlast, and it also froze after printing:

* (1/4) Tuning main GEMM kernel (GEMMK == 0) for fixed set of parameters

* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -precision 32 (single) [=default]
    -m 1024 [=default]
    -n 1024 [=default]
    -k 1024 [=default]
    -alpha 2.00 [=default]
    -beta 2.00 [=default]
    -fraction 1.00 [=default]
    -runs 2 [=default]
    -max_l2_norm 0.00 [=default]

When I compile CLBlast with -DCUDA=ON -DOPENCL=OFF, the clblast_tuner_xgemm binary works as expected. Unfortunately, this is not an option for llama.cpp.

My CUDA version is 12.1.0-3 and I am using a GTX 1070. I get the bug both when I install CLBlast via the package manager (6.3.0-1-MANJARO, Arch-based) and when I install CLBlast manually. The console outputs from cmake (v3.26.3) and make (v4.4.1) are:

CMake Deprecation Warning at CMakeLists.txt:12 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- The C compiler identification is GNU 12.2.1
-- The CXX compiler identification is GNU 12.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Building CLBlast with OpenCL API (default)
-- Found OpenCL: /opt/cuda/include  
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/johannesg/Projects/CLBlast
[  0%] Building CXX object CMakeFiles/tuners_common_library.dir/src/utilities/compile.cpp.o
[  2%] Building CXX object CMakeFiles/tuners_common_library.dir/src/utilities/utilities.cpp.o
[  2%] Building CXX object CMakeFiles/tuners_common_library.dir/src/utilities/clblast_exceptions.cpp.o
[  3%] Building CXX object CMakeFiles/tuners_common_library.dir/src/tuning/configurations.cpp.o
[  4%] Building CXX object CMakeFiles/tuners_common_library.dir/src/tuning/tuning.cpp.o
[  4%] Building CXX object CMakeFiles/clblast.dir/src/database/database.cpp.o
[  5%] Building CXX object CMakeFiles/tuners_common_library.dir/src/kernel_preprocessor.cpp.o
[  6%] Building CXX object CMakeFiles/tuners_common_library.dir/src/utilities/timing.cpp.o
[  8%] Building CXX object CMakeFiles/clblast.dir/src/utilities/compile.cpp.o
[  8%] Building CXX object CMakeFiles/clblast.dir/src/routines/common.cpp.o
[  9%] Building CXX object CMakeFiles/clblast.dir/src/utilities/clblast_exceptions.cpp.o
[ 10%] Building CXX object CMakeFiles/clblast.dir/src/utilities/timing.cpp.o
[ 10%] Building CXX object CMakeFiles/clblast.dir/src/api_common.cpp.o
[ 11%] Building CXX object CMakeFiles/clblast.dir/src/cache.cpp.o
[ 12%] Building CXX object CMakeFiles/clblast.dir/src/utilities/utilities.cpp.o
[ 13%] Building CXX object CMakeFiles/clblast.dir/src/kernel_preprocessor.cpp.o
[ 14%] Building CXX object CMakeFiles/clblast.dir/src/routine.cpp.o
[ 15%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xinvert.cpp.o
[ 16%] Building CXX object CMakeFiles/clblast.dir/src/tuning/configurations.cpp.o
[ 16%] Building CXX object CMakeFiles/clblast.dir/src/clblast.cpp.o
[ 17%] Building CXX object CMakeFiles/clblast.dir/src/clblast_c.cpp.o
[ 18%] Building CXX object CMakeFiles/clblast.dir/src/tuning/tuning_api.cpp.o
[ 19%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xswap.cpp.o
[ 20%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xscal.cpp.o
[ 21%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xcopy.cpp.o
[ 21%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xaxpy.cpp.o
[ 22%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xdot.cpp.o
[ 23%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xdotu.cpp.o
[ 24%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xdotc.cpp.o
[ 25%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xnrm2.cpp.o
[ 26%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xasum.cpp.o
[ 26%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xgemv.cpp.o
[ 27%] Building CXX object CMakeFiles/clblast.dir/src/routines/level1/xamax.cpp.o
[ 28%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xgbmv.cpp.o
[ 29%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xhemv.cpp.o
[ 30%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xhbmv.cpp.o
[ 31%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xhpmv.cpp.o
[ 32%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xsymv.cpp.o
[ 32%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xsbmv.cpp.o
[ 33%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xspmv.cpp.o
[ 34%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xtrmv.cpp.o
[ 35%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xtbmv.cpp.o
[ 36%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xtpmv.cpp.o
[ 37%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xtrsv.cpp.o
[ 37%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xger.cpp.o
[ 38%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xgeru.cpp.o
[ 39%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xgerc.cpp.o
[ 40%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xher.cpp.o
[ 41%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xhpr.cpp.o
[ 42%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xher2.cpp.o
[ 42%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xhpr2.cpp.o
[ 43%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xsyr.cpp.o
[ 44%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xspr.cpp.o
[ 45%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xsyr2.cpp.o
[ 46%] Building CXX object CMakeFiles/clblast.dir/src/routines/level2/xspr2.cpp.o
[ 47%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xgemm.cpp.o
[ 48%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xsymm.cpp.o
[ 48%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xhemm.cpp.o
[ 49%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xsyrk.cpp.o
[ 50%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xherk.cpp.o
[ 51%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xsyr2k.cpp.o
[ 52%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xher2k.cpp.o
[ 53%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xtrmm.cpp.o
[ 53%] Building CXX object CMakeFiles/clblast.dir/src/routines/level3/xtrsm.cpp.o
[ 54%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xhad.cpp.o
[ 55%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xomatcopy.cpp.o
[ 56%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xim2col.cpp.o
[ 57%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xcol2im.cpp.o
[ 58%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xconvgemm.cpp.o
[ 58%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xaxpybatched.cpp.o
[ 59%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xgemmbatched.cpp.o
[ 60%] Building CXX object CMakeFiles/clblast.dir/src/routines/levelx/xgemmstridedbatched.cpp.o
[ 61%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/copy/copy.cpp.o
[ 61%] Built target tuners_common_library
[ 62%] Building CXX object CMakeFiles/clblast_tuner_copy_fast.dir/src/tuning/kernels/copy_fast.cpp.o
[ 63%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/pad/pad.cpp.o
[ 64%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/padtranspose/padtranspose.cpp.o
[ 64%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/transpose/transpose.cpp.o
[ 65%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xaxpy/xaxpy.cpp.o
[ 66%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xdot/xdot.cpp.o
[ 67%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xgemm/xgemm.cpp.o
[ 68%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xgemm_direct/xgemm_direct.cpp.o
[ 69%] Building CXX object CMakeFiles/clblast_tuner_copy_pad.dir/src/tuning/kernels/copy_pad.cpp.o
[ 70%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xgemv/xgemv.cpp.o
[ 71%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xgemv_fast/xgemv_fast.cpp.o
[ 71%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xgemv_fast_rot/xgemv_fast_rot.cpp.o
[ 72%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xger/xger.cpp.o
[ 73%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/invert/invert.cpp.o
[ 74%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/gemm_routine/gemm_routine.cpp.o
[ 75%] Linking CXX executable clblast_tuner_copy_fast
[ 76%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/trsv_routine/trsv_routine.cpp.o
[ 77%] Building CXX object CMakeFiles/clblast.dir/src/database/kernels/xconvgemm/xconvgemm.cpp.o
[ 78%] Building CXX object CMakeFiles/clblast_tuner_transpose_fast.dir/src/tuning/kernels/transpose_fast.cpp.o
[ 78%] Built target clblast_tuner_copy_fast
[ 79%] Building CXX object CMakeFiles/clblast_tuner_transpose_pad.dir/src/tuning/kernels/transpose_pad.cpp.o
[ 80%] Building CXX object CMakeFiles/clblast_tuner_xaxpy.dir/src/tuning/kernels/xaxpy.cpp.o
[ 81%] Building CXX object CMakeFiles/clblast_tuner_xdot.dir/src/tuning/kernels/xdot.cpp.o
[ 82%] Building CXX object CMakeFiles/clblast_tuner_xger.dir/src/tuning/kernels/xger.cpp.o
[ 83%] Building CXX object CMakeFiles/clblast_tuner_xgemm.dir/src/tuning/kernels/xgemm.cpp.o
[ 83%] Building CXX object CMakeFiles/clblast_tuner_xgemm_direct.dir/src/tuning/kernels/xgemm_direct.cpp.o
[ 84%] Building CXX object CMakeFiles/clblast_tuner_xgemv.dir/src/tuning/kernels/xgemv.cpp.o
[ 85%] Building CXX object CMakeFiles/clblast_tuner_invert.dir/src/tuning/kernels/invert.cpp.o
[ 85%] Building CXX object CMakeFiles/clblast_tuner_xconvgemm.dir/src/tuning/kernels/xconvgemm.cpp.o
[ 86%] Linking CXX executable clblast_tuner_copy_pad
[ 86%] Built target clblast_tuner_copy_pad
[ 87%] Linking CXX executable clblast_tuner_transpose_fast
[ 87%] Built target clblast_tuner_transpose_fast
[ 88%] Linking CXX executable clblast_tuner_transpose_pad
[ 88%] Built target clblast_tuner_transpose_pad
[ 89%] Linking CXX executable clblast_tuner_xaxpy
[ 89%] Built target clblast_tuner_xaxpy
[ 90%] Linking CXX executable clblast_tuner_xdot
[ 91%] Linking CXX executable clblast_tuner_xger
[ 92%] Linking CXX executable clblast_tuner_xconvgemm
[ 93%] Linking CXX executable clblast_tuner_xgemv
[ 93%] Linking CXX executable clblast_tuner_invert
[ 93%] Built target clblast_tuner_xdot
[ 93%] Built target clblast_tuner_xger
[ 93%] Built target clblast_tuner_xconvgemm
[ 93%] Built target clblast_tuner_xgemv
[ 93%] Built target clblast_tuner_invert
[ 94%] Linking CXX executable clblast_tuner_xgemm_direct
[ 94%] Built target clblast_tuner_xgemm_direct
[ 95%] Linking CXX executable clblast_tuner_xgemm
[ 95%] Built target clblast_tuner_xgemm
[ 95%] Linking CXX shared library libclblast.so
[ 95%] Built target clblast
[ 97%] Building CXX object CMakeFiles/clblast_tuner_routine_xtrsv.dir/src/tuning/routines/xtrsv.cpp.o
[ 97%] Building CXX object CMakeFiles/clblast_tuner_routine_xgemm.dir/src/tuning/routines/xgemm.cpp.o
[ 99%] Building CXX object CMakeFiles/clblast_tuner_routine_xtrsv.dir/test/test_utilities.cpp.o
[ 99%] Building CXX object CMakeFiles/clblast_tuner_routine_xgemm.dir/test/test_utilities.cpp.o
[ 99%] Linking CXX executable clblast_tuner_routine_xtrsv
[ 99%] Built target clblast_tuner_routine_xtrsv
[100%] Linking CXX executable clblast_tuner_routine_xgemm
[100%] Built target clblast_tuner_routine_xgemm

Thank you for reporting the issue. Are you sure this is a CLBlast issue? I don't see any calls to CLBlast in the ggml_cl_init function that you are referring to, just normal OpenCL calls. Perhaps your OpenCL setup is not correct? I would double-check that first, e.g. run clinfo and make sure the output is as expected.

Thank you for the help. clinfo seems to have the same issue, so this indeed seems to be a problem with my OpenCL setup.
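For anyone hitting a similar freeze, the check discussed in this thread (does a plain OpenCL tool such as clinfo hang as well?) can be scripted so that a hang is reported instead of blocking the terminal. The following is only a minimal sketch: the helper name `probe` and the 10-second timeout are assumptions made for illustration, not part of CLBlast or llama.cpp. If the process is stuck inside the driver, killing it after the timeout may itself take a while.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Run cmd and classify the outcome.

    Returns one of:
      "ok"      - the command exited with status 0
      "failed"  - the command exited with a non-zero status
      "hung"    - the command did not finish within timeout_s seconds
      "missing" - the executable could not be found
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,  # output is not needed, only the outcome
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,          # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hung"
    except FileNotFoundError:
        return "missing"
    return "ok" if result.returncode == 0 else "failed"
```

Calling `probe(["clinfo"])` and getting `"hung"` would point at the OpenCL installation rather than at CLBlast, since clinfo does not use CLBlast. The same helper could be pointed at `clblast_tuner_xgemm` for comparison.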