FBGEMM_GPU Build Issue
rkindi opened this issue
I am unable to build fbgemm_gpu due to the following error:
/usr/include/c++/7/bits/hashtable.h:268:7: error: static assertion failed: Cache the hash code or qualify your functors involved in hash code and bucket index computation with noexcept
static_assert(noexcept(declval<const __hash_code_base_access&>()
^~~~~~~~~~~~~
Some relevant environment variables:
$ echo $TORCH_CUDA_ARCH_LIST, $CUDA_HOME, $CUDACXX
8.0, /usr/local/cuda-11.1, /usr/local/cuda-11.1/bin/nvcc
My conda env:
name: my_fbgemm_setup
channels:
- pytorch-nightly
- bottler
- iopath
- nvidia
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=1_llvm
- blas=1.0=mkl
- ca-certificates=2021.10.26=h06a4308_2
- colorama=0.4.4=pyh9f0ad1d_0
- cudatoolkit=11.1.74=h6bb024c_0
- iopath=0.1.9=py38
- jinja2=3.0.3=pyhd8ed1ab_0
- ld_impl_linux-64=2.36.1=hea4e1c9_2
- libffi=3.3=h58526e2_2
- libgcc-ng=11.2.0=h1d223b6_11
- libstdcxx-ng=11.2.0=he4da1e4_11
- libuv=1.42.0=h7f98852_0
- libzlib=1.2.11=h36c2ea0_1013
- llvm-openmp=12.0.1=h4bd325d_1
- markupsafe=2.0.1=py38h497a2fe_1
- mkl=2021.4.0=h8d4b97c_729
- mkl-service=2.4.0=py38h7f8727e_0
- mkl_fft=1.3.1=py38hd3c417c_0
- mkl_random=1.2.2=py38h51133e4_0
- ncurses=6.2=h58526e2_4
- numpy=1.21.2=py38h20f2e39_0
- numpy-base=1.21.2=py38h79a1101_0
- nvidiacub=1.10.0=0
- openssl=1.1.1l=h7f8727e_0
- pip=21.3.1=pyhd8ed1ab_0
- portalocker=2.3.2=py38h578d9bd_1
- python=3.8.6=hffdb5ce_5_cpython
- python_abi=3.8=2_cp38
- pytorch=1.11.0.dev20211115=py3.8_cuda11.1_cudnn8.0.5_0
- pytorch-mutex=1.0=cuda
- readline=8.1=h46c0cb4_0
- setuptools=59.1.0=py38h578d9bd_0
- six=1.16.0=pyhd3eb1b0_0
- sqlite=3.36.0=h9cd32fc_2
- tbb=2021.4.0=h4bd325d_1
- tk=8.6.11=h27826a3_1
- tqdm=4.62.3=pyhd8ed1ab_0
- typing_extensions=3.10.0.2=pyha770c72_0
- wheel=0.37.0=pyhd8ed1ab_1
- xz=5.2.5=h516909a_1
- zlib=1.2.11=h36c2ea0_1013
prefix: /data/home/rahulkindi/miniconda3/envs/my_fbgemm_setup
Output of python3 setup.py build develop: setup.out.txt
Any help you can provide is appreciated!
Could you check if the new OSS build path from #665 can fix this issue?
#665 fixes the build failures. However, the split_table_batched_embeddings_test and split_embedding_inference_converter_test tests are failing.
% python test/split_table_batched_embeddings_test.py
Traceback (most recent call last):
  File "test/split_table_batched_embeddings_test.py", line 15, in <module>
    import fbgemm_gpu.split_table_batched_embeddings_ops as split_table_batched_embeddings_ops
  File "/home/rick/anaconda3/envs/pytorch/lib/python3.8/site-packages/fbgemm_gpu-0.0.1-py3.8-linux-x86_64.egg/fbgemm_gpu/split_table_batched_embeddings_ops.py", line 17, in <module>
    import fbgemm_gpu.split_embedding_codegen_lookup_invokers as invokers
  File "/home/rick/anaconda3/envs/pytorch/lib/python3.8/site-packages/fbgemm_gpu-0.0.1-py3.8-linux-x86_64.egg/fbgemm_gpu/split_embedding_codegen_lookup_invokers/__init__.py", line 19, in <module>
    import fbgemm_gpu.split_embedding_codegen_lookup_invokers.lookup_rowwise_weighted_adagrad as lookup_rowwise_weighted_adagrad  # noqa: F401
ModuleNotFoundError: No module named 'fbgemm_gpu.split_embedding_codegen_lookup_invokers.lookup_rowwise_weighted_adagrad'
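In case it helps narrow this down: the ModuleNotFoundError points at a lookup_* module that is missing from the installed package, so it may be worth listing which invoker modules actually made it into site-packages. Below is a rough sketch for that, not anything from the fbgemm_gpu repo. The helper name list_submodules is made up for illustration, and it assumes the package is installed as an unpacked directory (not a zipped egg). It only looks at files on disk, so it does not trigger the failing import in __init__.py.

```python
import importlib.util
from pathlib import Path


def list_submodules(package_name, subpackage):
    """List .py module names on disk under <package_name>/<subpackage>.

    Hypothetical helper for illustration. Returns None if the top-level
    package cannot be located, otherwise a sorted list of module names.
    Looking up a top-level name with find_spec does not execute the package.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    found = set()
    for location in spec.submodule_search_locations:
        directory = Path(location) / subpackage
        if directory.is_dir():
            # Collect every module file except the package initializer.
            found.update(
                p.stem for p in directory.glob("*.py") if p.stem != "__init__"
            )
    return sorted(found)


if __name__ == "__main__":
    modules = list_submodules(
        "fbgemm_gpu", "split_embedding_codegen_lookup_invokers"
    )
    if modules is None:
        print("fbgemm_gpu is not installed in this environment")
    else:
        print("invoker modules found on disk:")
        for name in modules:
            print("  " + name)
        wanted = "lookup_rowwise_weighted_adagrad"
        print(wanted, "present:", wanted in modules)
```

If lookup_rowwise_weighted_adagrad is absent from the list while the other lookup_* modules are present, that would suggest the file was not produced or not packaged during the build, rather than a path problem in the test environment.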
Should be fixed by #766.