ztxz16 / fastllm

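A pure C++ cross-platform LLM acceleration library, callable from Python. ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices.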

Error during make -j

AIlaowong opened this issue · comments

I am using CUDA 12.1 and make -j fails with errors. The full installation process was as follows:

(cuda12_1) root@I19359398490090128f:/hy-tmp# cd fastllm-master/
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# mkdir build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# cd build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# cmake .. -DUSE_CUDA=ON
-- The CXX compiler identification is GNU 9.4.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- USE_CUDA: ON
-- USE_TFACC: OFF
-- For legacy CUDA GPUs: OFF
-- PYTHON_API: OFF
-- BUILD_CLI: OFF
-- USE_SENTENCEPIECE: OFF
-- USE_IVCOREX: OFF
-- CMAKE_CXX_FLAGS: -pthread --std=c++17 -O2 -march=native
-- The CUDA compiler identification is NVIDIA 12.1.105
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /hy-tmp/fastllm-master/build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# make -j
Scanning dependencies of target fastllm
Scanning dependencies of target fastllm_tools
[ 1%] Building CXX object CMakeFiles/fastllm.dir/src/fastllm.cpp.o
[ 3%] Building CXX object CMakeFiles/fastllm.dir/src/models/moss.cpp.o
[ 5%] Building CXX object CMakeFiles/fastllm.dir/src/models/llama.cpp.o
[ 6%] Building CXX object CMakeFiles/fastllm.dir/src/model.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/executor.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/chatglm.cpp.o
[ 12%] Building CXX object CMakeFiles/fastllm.dir/src/device.cpp.o
[ 13%] Building CXX object CMakeFiles/fastllm_tools.dir/src/fastllm.cpp.o
[ 15%] Building CXX object CMakeFiles/fastllm.dir/src/models/chatglm.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm_tools.dir/src/model.cpp.o
[ 20%] Building CXX object CMakeFiles/fastllm.dir/src/executor.cpp.o
[ 22%] Building CXX object CMakeFiles/fastllm.dir/src/models/glm.cpp.o
[ 24%] Building CXX object CMakeFiles/fastllm_tools.dir/src/device.cpp.o
[ 25%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/llama.cpp.o
[ 27%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevice.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moss.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 32%] Building CXX object CMakeFiles/fastllm.dir/src/models/deepseekv2.cpp.o
[ 34%] Building CXX object CMakeFiles/fastllm.dir/src/models/basellm.cpp.o
[ 36%] Building CXX object CMakeFiles/fastllm.dir/src/template.cpp.o
[ 37%] Building CXX object CMakeFiles/fastllm.dir/src/models/minicpm.cpp.o
[ 39%] Building CXX object CMakeFiles/fastllm_tools.dir/src/template.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm.dir/src/models/bert.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevice.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/basellm.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/glm.cpp.o
[ 48%] Building CXX object CMakeFiles/fastllm.dir/src/models/qwen.cpp.o
[ 50%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/qwen.cpp.o
[ 51%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm.dir/src/models/internlm2.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/minicpm.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/internlm2.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/bert.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm.dir/src/models/moe.cpp.o
[ 62%] Building CXX object CMakeFiles/fastllm_tools.dir/third_party/json11/json11.cpp.o
[ 63%] Building CUDA object CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/tools/src/pytools.cpp.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moe.cpp.o
[ 68%] Building CUDA object CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/third_party/json11/json11.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/deepseekv2.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevice.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevice.cpp.o

                                                          ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
b[idx] = __hmul(a[idx], v);
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
const uint8_t *baseB = B + p * m;
^

Remark: The warnings can be suppressed with "-diag-suppress "

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
float4 regA;
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
union_char4 regB;
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
const size_t ST128_FP16_COUNT = 8;
^

12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm_tools.dir/build.make:336: CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....

        function "__half::operator long long() const" (declared at line 247 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
        function "__half::operator unsigned long long() const" (declared at line 250 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
        function "__half::operator __nv_bool() const" (declared at line 254 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
                                                                ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "hexp" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
b[idx] = __hmul(a[idx], v);
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
const uint8_t *baseB = B + p * m;
^

Remark: The warnings can be suppressed with "-diag-suppress "

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
float4 regA;
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
union_char4 regB;
^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
const size_t ST128_FP16_COUNT = 8;
^

12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm.dir/build.make:336: CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:279: CMakeFiles/fastllm.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:90: CMakeFiles/fastllm_tools.dir/all] Error 2
make: *** [Makefile:84: all] Error 2

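This usually means cmake did not detect the CUDA architecture. You need to change CMAKE_CUDA_ARCHITECTURES in CMakeLists.txt to the compute capability of your GPU (typically a value such as 80 or 90).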
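
For context: the undefined identifiers in the log (__hdiv, __hmul, __hsub, hexp) are half-precision intrinsics that the CUDA headers only expose when compiling for compute capability 5.3 or higher, which is consistent with nvcc falling back to an older default architecture. The following is a minimal sketch of how the architecture value could be looked up and passed at configure time instead of editing the file by hand. The helper name cap_to_arch is hypothetical, and the compute_cap query field is assumed to be available in the installed driver's nvidia-smi. If the project's CMakeLists.txt sets CMAKE_CUDA_ARCHITECTURES unconditionally, the value has to be edited there, as described above.

```shell
# Convert a compute capability such as "8.6" into the form CMake expects ("86").
# cap_to_arch is a hypothetical helper name used only for this sketch.
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Assumption: this nvidia-smi version supports the compute_cap query field.
    cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)"
    arch="$(cap_to_arch "$cap")"
    echo "Reconfigure from a clean build directory with:"
    echo "  cmake .. -DUSE_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=$arch"
fi
```

The sketch only prints the suggested command, so it can be checked before anything is rebuilt. Removing the old build directory first avoids reusing a cached architecture setting.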

Has this been solved? I ran into the same problem.
I am also on CUDA 12.1.

I upgraded cmake and recompiled, and it worked.
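
A likely reason the upgrade helps is that the CMAKE_CUDA_ARCHITECTURES variable was introduced in CMake 3.18, so an older cmake ignores it. The thread does not state which versions were involved, so this is an inference. A small sketch for checking the installed version follows. The helper name version_ge is hypothetical, and it relies on GNU sort -V being available.

```shell
# Succeeds when version $1 is at least version $2 (relies on GNU sort -V).
# version_ge is a hypothetical helper name used only for this sketch.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

if command -v cmake >/dev/null 2>&1; then
    # The first line of the output looks like "cmake version 3.16.3".
    installed="$(cmake --version | head -n1 | awk '{print $3}')"
    if version_ge "$installed" 3.18; then
        echo "cmake $installed: CMAKE_CUDA_ARCHITECTURES is supported"
    else
        echo "cmake $installed: older than 3.18, consider upgrading"
    fi
fi
```

After upgrading, configure again in a fresh build directory so the new cmake does not reuse the old cache.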