pytorch / kineto

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

Error while building torch from source: ‘CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernelExC_v11060’ was not declared in this scope;

moonbucks opened this issue

Hi team,
I keep seeing the following error while compiling torch from source.
I am using CUDA 11.7.

[4131/6657] Building CXX object third_party/kineto/libkineto/CMakeFiles/kineto_base.dir/src/CuptiActivityProfiler.cpp.o
FAILED: third_party/kineto/libkineto/CMakeFiles/kineto_base.dir/src/CuptiActivityProfiler.cpp.o 
/usr/bin/c++ -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -I/home/user/pytorch/cmake/../third_party/benchmark/include -I/home/user/pytorch/third_party/onnx -I/home/user/pytorch/build/third_party/onnx -I/home/user/pytorch/third_party/foxi -I/home/user/pytorch/build/third_party/foxi -I/home/user/pytorch/third_party/kineto/libkineto/include -I/home/user/pytorch/third_party/kineto/libkineto/src -I/home/user/pytorch/third_party/kineto/libkineto/third_party/dynolog -I/home/user/pytorch/third_party/fmt/include -I/home/user/pytorch/third_party/kineto/libkineto/third_party/dynolog/dynolog/src/ipcfabric -I/usr/local/cuda-11.7/extras/CUPTI/include -I/include/roctracer -I/opt/rocm/include -isystem /home/user/pytorch/build/third_party/gloo -isystem /home/user/pytorch/cmake/../third_party/gloo -isystem /home/user/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -isystem /home/user/pytorch/cmake/../third_party/googletest/googlemock/include -isystem /home/user/pytorch/cmake/../third_party/googletest/googletest/include -isystem /home/user/pytorch/third_party/protobuf/src -isystem /home/user/miniconda3/envs/latest/include -isystem /home/user/pytorch/third_party/gemmlowp -isystem /home/user/pytorch/third_party/neon2sse -isystem /home/user/pytorch/third_party/XNNPACK/include -isystem /home/user/pytorch/third_party/ittapi/include -isystem /home/user/pytorch/cmake/../third_party/eigen -isystem /usr/local/cuda-11.7/include -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -O3 -DNDEBUG -DNDEBUG -std=c++17 -fPIC -DMKL_HAS_SBGEMM -DTORCH_USE_LIBUV -DCAFFE2_USE_GLOO -DKINETO_NAMESPACE=libkineto -DFMT_HEADER_ONLY -DENABLE_IPC_FABRIC -std=c++17 -DHAS_CUPTI -MD -MT third_party/kineto/libkineto/CMakeFiles/kineto_base.dir/src/CuptiActivityProfiler.cpp.o -MF 
third_party/kineto/libkineto/CMakeFiles/kineto_base.dir/src/CuptiActivityProfiler.cpp.o.d -o third_party/kineto/libkineto/CMakeFiles/kineto_base.dir/src/CuptiActivityProfiler.cpp.o -c /home/user/pytorch/third_party/kineto/libkineto/src/CuptiActivityProfiler.cpp
In file included from /home/user/pytorch/third_party/kineto/libkineto/src/CuptiActivityProfiler.cpp:36:
/home/user/pytorch/third_party/kineto/libkineto/src/CuptiActivity.cpp: In member function ‘virtual bool libkineto::RuntimeActivity::flowStart() const’:
/home/user/pytorch/third_party/kineto/libkineto/src/CuptiActivity.cpp:248:25: error: ‘CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernelExC_v11060’ was not declared in this scope; did you mean ‘CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000’?
  248 |       activity_.cbid == CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernelExC_v11060;
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                         CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000

My command to compile PyTorch from source: python3 setup.py develop
nvcc version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

I installed all dependencies and synced the submodules, as well as the main PyTorch code, to the latest version.
Could you help me solve this problem?
Thanks

Just took a look at this. There was a change to guard the use of that callback ID based on CUPTI_API_VERSION:
#792 enables it when the CUPTI API version is >= 17.

Checking the headers, however, shows that 17 is too low:

CUDA 11.7.1 (and 11.7.0) does not have the CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernelExC_v11060 callback ID:
https://gitlab.com/nvidia/headers/cuda-individual/cupti/-/blob/cuda-11.7.1/cupti_runtime_cbid.h?ref_type=tags
Its CUPTI API version is 17:
https://gitlab.com/nvidia/headers/cuda-individual/cupti/-/blob/cuda-11.7.1/cupti_version.h?ref_type=tags#L104

And,

CUDA 11.8.0 does have the CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernelExC_v11060 callback ID:
https://gitlab.com/nvidia/headers/cuda-individual/cupti/-/blob/cuda-11.8.0/cupti_runtime_cbid.h?ref_type=tags#L440
Its CUPTI API version is 18:
https://gitlab.com/nvidia/headers/cuda-individual/cupti/-/blob/cuda-11.8.0/cupti_version.h?ref_type=tags#L105

I'll add a fix to update the define. @moonbucks, as a local fix you can change the version in the '>=' check at /home/user/pytorch/third_party/kineto/libkineto/src/CuptiActivity.cpp:247 from 17 to 18, and it will probably work. Let us know.
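
For anyone reading along, here is a rough, self-contained sketch of what a version guard like this looks like. It is not the actual kineto code: the enum `SketchCbid`, its values, and the function `isKernelLaunch` are hypothetical stand-ins, and the fallback `#define` only exists so the snippet compiles without the CUPTI headers. The one detail taken from this thread is the threshold of 18.

```cpp
#include <cstdint>

// Stand-in for the real CUPTI headers: if no CUPTI_API_VERSION is provided
// by the build, assume 17, the value reported above for CUDA 11.7.x.
#ifndef CUPTI_API_VERSION
#define CUPTI_API_VERSION 17
#endif

// Hypothetical callback IDs; the numeric values are illustrative only and
// do not match the real CUPTI_RUNTIME_TRACE_CBID_* enumerators.
enum SketchCbid : uint32_t {
  kLaunchKernel = 1,
  kLaunchKernelExC = 2,  // only exists in headers with API version >= 18
};

// Returns true when the callback ID is one of the kernel-launch IDs that
// this build knows about.
bool isKernelLaunch(uint32_t cbid) {
  bool result = cbid == kLaunchKernel;
#if defined(CUPTI_API_VERSION) && CUPTI_API_VERSION >= 18
  // Compiled only when the headers are new enough to declare the ID.
  // With a threshold of 17, this line would also be compiled against
  // CUDA 11.7 headers, where the real enumerator is not declared.
  result = result || cbid == kLaunchKernelExC;
#endif
  return result;
}
```

With the guard at 18, a CUDA 11.7 build skips the newer ID entirely, while a CUDA 11.8 build picks it up.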

Changing 17 to 18 solved the problem. Thanks for your help!