Importing fbgemm_gpu fails with an undefined symbol error: undefined symbol: _ZN3c104impl8GPUTrace13gpuTraceStateE
sea-of-freedom opened this issue
OS: x86_64 / Intel(R) Xeon(R) Gold 6130T CPU
CUDA: NVIDIA-SMI 465.19.01, Driver Version: 465.19.01, CUDA Version: 11.3
torch: 1.12.1+cu113
Python: 3.9
GPU: NVIDIA Tesla P4
torchrec & fbgemm-gpu: 0.3.2 (pip install)
When I import fbgemm_gpu, I get an error:
xxxxx/lib/python3.9/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN3c104impl8GPUTrace13gpuTraceStateE
Thanks for your help.
Hi @666easyfuture, it seems that the library was either compiled with an incompatible tool version or is out of date. You can try adding its directory to Python's module search path:
sys.path.insert(0,"xxxxx/lib/python3.9/site-packages/fbgemm_gpu")
and see if that works. Otherwise, can you try installing the latest release of fbgemm-gpu, following the steps in the proper order? (See the instructions here: https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/docs/InstallationInstructions.md). Please let me know how it goes, thank you.
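For readers following along: the mangled name in the error can be decoded to see which library is expected to provide it. Below is a minimal sketch that handles only the simplest Itanium C++ ABI form, `_ZN<len><name>...E`, with no templates or argument types; it is not a full demangler (the `c++filt` tool covers the general case), and the function name `demangle_simple_nested` is made up for this illustration.

```python
def demangle_simple_nested(symbol: str) -> str:
    """Decode a plain nested name of the form _ZN<len><name>...E.

    Illustrative helper only: templates, argument lists, and
    substitutions are NOT supported.
    """
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("not a simple nested mangled name")
    body = symbol[3:-1]
    parts = []
    i = 0
    while i < len(body):
        # Read the decimal length prefix of the next component.
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        if j == i:
            raise ValueError(f"expected a length prefix at offset {i}")
        length = int(body[i:j])
        name = body[j:j + length]
        if len(name) != length:
            raise ValueError("truncated component")
        parts.append(name)
        i = j + length
    return "::".join(parts)


if __name__ == "__main__":
    print(demangle_simple_nested("_ZN3c104impl8GPUTrace13gpuTraceStateE"))
    # prints: c10::impl::GPUTrace::gpuTraceState
```

The decoded name sits in the `c10` namespace, which belongs to PyTorch's core library, so the fbgemm_gpu binary is looking for a symbol that the installed torch build apparently does not export. That fits the suggestion that the two packages were built against different versions.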
Now I am using fbgemm-gpu==0.4.1 and have added sys.path.insert(0,"xxxxx/lib/python3.9/site-packages/fbgemm_gpu") to my code. I then get another, similar error: "xxxxxxx/lib/python3.9/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE"
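Both errors name symbols from PyTorch's own namespaces (`c10` and `at`), so it may help to record exactly which package versions are installed side by side before trying another combination. The sketch below uses only the standard library and does not import torch or fbgemm_gpu, so it runs even when the import itself fails. The helper name `installed_versions` is made up for this illustration, and the package names are the pip distribution names mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each pip distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for pkg, ver in installed_versions(["torch", "torchrec", "fbgemm-gpu"]).items():
        print(f"{pkg}: {ver if ver is not None else 'not installed'}")
```

Posting this output with a report makes it easier to tell whether the fbgemm-gpu wheel was built for the torch version that is actually installed.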
Hi @666easyfuture, after looking into it, we do not actively support Pascal anymore. We maintain support from V100 onwards. Can you try building the binary from source? Please see https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/docs/BuildInstructions.md for instructions, and set the compute capability/cuda_arch_list to match your GPU. Thank you.
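To find the value to use for the compute capability, one option is to ask PyTorch for it. The following is a minimal sketch, assuming torch imports cleanly and can see the GPU; the helper names are made up for this illustration, and the exact build variable to set (for example `TORCH_CUDA_ARCH_LIST`, an assumption here) should be taken from the build instructions linked above. For reference, the Tesla P4 in this report is a Pascal card with compute capability 6.1.

```python
def arch_list_entry(capability):
    """Format a (major, minor) compute capability tuple as 'major.minor'."""
    major, minor = capability
    return f"{major}.{minor}"


def detect_capability(device_index=0):
    """Return the (major, minor) capability of a GPU, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device_index)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device visible to torch; look up the capability manually.")
    else:
        print(f"Compute capability: {arch_list_entry(cap)}")
```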
Please feel free to reopen the case if there are issues.