sandialabs / compadre

Compadre (Compatible Particle Discretization and Remap)

nvcc error : 'ptxas' died due to signal 11 (Invalid memory reference)

Yiltan opened this issue

I get the above error when I try to build the project. The full error is shown at the bottom of this issue.

I am using a slightly modified advanced-configure-gpu-with-python-with-kokkos-build.sh script to install.

The file is as follows:

#!/bin/bash
#
#
# Script for invoking CMake using the CMakeLists.txt file in this directory.

# Cuda on GPU via Kokkos
# With Python interface
# Standalone Kokkos will be auto-built

# following lines for build directory cleanup
find . ! -name '*.sh' -type f -exec rm -f {} +
find . -mindepth 1 -type d -exec rm -rf {} +

# pick your favorite c++ compiler
MY_CXX_COMPILER=`which mpicxx`

# this will install in your build directory in a folder called install by default
INSTALL_PREFIX="./install"

# GPU specific
SCRIPTPATH="$( cd "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"
NVCC_WRAPPER=$SCRIPTPATH/../kokkos/bin/nvcc_wrapper
NVCC_WRAPPER_DEFAULT_COMPILER=$MY_CXX_COMPILER

# optional CMake variables to pass in are PYTHON_PREFIX and PYTHON_EXECUTABLE
# if they are not passed in, then `which python` is called to determine your
# python executable, and from that sitepackages and libraries are inferred.
cmake \
    -D CMAKE_CXX_COMPILER="$NVCC_WRAPPER" \
    -D CMAKE_INSTALL_PREFIX="$INSTALL_PREFIX" \
    -D Compadre_USE_PYTHON:BOOL=ON \
    -D Compadre_DEBUG:BOOL=OFF \
    -D Compadre_USE_MPI:BOOL=ON \
    -D Kokkos_ENABLE_PTHREAD:BOOL=OFF \
    -D Compadre_USE_CUDA:BOOL=ON \
    -D Kokkos_ARCH_VOLTA70:BOOL=ON \
    -D Kokkos_ARCH_POWER9:BOOL=ON \
    -D Kokkos_ENABLE_CUDA:BOOL=ON \
    -D Kokkos_CUDA_DIR:PATH=$CUDA_SCINET_HOME \
    \
    ..

The output from running the script:

yiltan@power9pc:/scratch/yiltan/compadre/build$ ./advanced-configure-gpu-with-python-with-kokkos-build.sh 
-- The CXX compiler identification is GNU 8.4.0
-- Check for working CXX compiler: /scratch/yiltan/compadre/build/../kokkos/bin/nvcc_wrapper
-- Check for working CXX compiler: /scratch/yiltan/compadre/build/../kokkos/bin/nvcc_wrapper -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- CMAKE_VERSION: 3.16.3
-- Compadre_VERSION: 1.3.3
-- USE_XSDK_DEFAULTS: OFF
-- BUILD_TESTING: OFF
-- BUILD_SHARED_LIBS: ON
-- CMAKE_INSTALL_PREFIX: /scratch/yiltan/compadre/build/install
-- Compadre_DEBUG: OFF
-- Compadre_EXTREME_DEBUG: OFF
-- PYTHON_CALLING_BUILD: OFF
-- Compadre_USE_PYTHON: ON
-- Compadre_USE_MATLAB: OFF
-- Compadre_USE_MPI: ON
-- PYTHON_EXECUTABLE: 
-- Python executable location PYTHON_EXECUTABLE not given. Search made using 'which python'
-- PYTHON_EXECUTABLE: /scratch/yiltan/.conda/envs/compandre/bin/python
-- Trilinos_PREFIX: 
-- KokkosCore_PREFIX: 
-- KokkosKernels_PREFIX: 
-- Kokkos_ENABLE_CUDA: ON
-- Kokkos_ENABLE_OPENMP: ON
-- Kokkos_ENABLE_PTHREAD: OFF
-- Setting default Kokkos CXX standard to 11
-- Setting policy CMP0074 to use <Package>_ROOT variables
-- The project name is: Kokkos
-- Using -std=c++11 for C++11 standard as feature
-- Execution Spaces:
--   Device Parallel: CUDA
--     Host Parallel: OPENMP
--       Host Serial: NONE
-- 
-- Architectures:
--  POWER9
--  VOLTA70
-- Found TPLLIBDL: /usr/lib64/libdl.so  
-- Compadre_USE_CUDA: ON
-- Setting policy CMP0074 to use <Package>_ROOT variables
-- The project name is: KokkosKernels
-- The project name is: KokkosKernels
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found CUDA: /opt/base/cuda/10.2.89 (found version "10.2") 

=======================
KokkosKernels ETI Types
   Devices:  <Cuda,CudaSpace>;<Cuda,CudaUVMSpace>;<OpenMP,HostSpace>
   Scalars:  
   Ordinals: 
   Offsets:  
   Layouts:  

KokkosKernels TPLs
   CUBLAS:      /opt/base/cuda/10.2.89/lib64/libcublas.so
   CUSPARSE:    /opt/base/cuda/10.2.89/lib64/libcusparse.so
=======================

-- Compadre_USE_MPI: ON
-- MPI Enabled: TRUE
-- MPI_CXX_INCLUDE_PATH: /opt/include/openmpi;/opt/include/openmpi/opal/mca/event/libevent2022/libevent;/opt/include/openmpi/opal/mca/event/libevent2022/libevent/include;/opt/include;/opt/include
-- MPI_CXX_LIBRARIES: /opt/lib/libmpi.so
-- Compadre_TESTS: ON
-- Compadre_EXAMPLES: ON
-- Compadre_SEMVER = 1.3.3-sha.4ffecba+00111
-- Found PythonInterp: /scratch/yiltan/.conda/envs/compandre/bin/python (found version "3.7.9") 
-- Found PythonLibs: /scratch/yiltan/.conda/envs/compandre/lib/libpython3.7m.so
-- pybind11 v2.5.dev1
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Failed
-- LTO disabled (not supported by the compiler and/or linker)
-- Configuring done
-- Generating done
-- Build files have been written to: /scratch/yiltan/compadre/build

After running make -j 4

yiltan@power9pc:/scratch/yiltan/compadre/build$  make
Scanning dependencies of target kokkoscore
[  1%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_CPUDiscovery.cpp.o
[  3%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Core.cpp.o
[  4%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Error.cpp.o
[  6%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_ExecPolicy.cpp.o
[  7%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostBarrier.cpp.o
[  9%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostSpace.cpp.o
[ 11%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostSpace_deepcopy.cpp.o
[ 12%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostThreadTeam.cpp.o
[ 14%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_MemoryPool.cpp.o
[ 15%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Profiling_Interface.cpp.o
[ 17%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Serial_Task.cpp.o
[ 19%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_SharedAlloc.cpp.o
[ 20%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Spinwait.cpp.o
[ 22%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Stacktrace.cpp.o
[ 23%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_hwloc.cpp.o
[ 25%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/Cuda/Kokkos_CudaSpace.cpp.o
[ 26%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/Cuda/Kokkos_Cuda_Instance.cpp.o
[ 28%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/Cuda/Kokkos_Cuda_Locks.cpp.o
[ 30%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/Cuda/Kokkos_Cuda_Task.cpp.o
[ 31%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/OpenMP/Kokkos_OpenMP_Exec.cpp.o
[ 33%] Building CXX object kokkos/core/src/CMakeFiles/kokkoscore.dir/OpenMP/Kokkos_OpenMP_Task.cpp.o
[ 34%] Linking CXX shared library libkokkoscore.so
[ 34%] Built target kokkoscore
Scanning dependencies of target kokkoscontainers
[ 36%] Building CXX object kokkos/containers/src/CMakeFiles/kokkoscontainers.dir/impl/Kokkos_UnorderedMap_impl.cpp.o
[ 38%] Linking CXX shared library libkokkoscontainers.so
[ 38%] Built target kokkoscontainers
Scanning dependencies of target kokkoskernels
[ 39%] Building CXX object kokkos-kernels/src/CMakeFiles/kokkoskernels.dir/dummy.cpp.o
[ 41%] Building CXX object kokkos-kernels/src/CMakeFiles/kokkoskernels.dir/impl/tpls/KokkosBlas_Cuda_tpl.cpp.o
[ 42%] Linking CXX shared library libkokkoskernels.so
[ 42%] Built target kokkoskernels
Scanning dependencies of target compadre
[ 44%] Building CXX object src/CMakeFiles/compadre.dir/Compadre_GMLS.cpp.o

nvcc error   : 'ptxas' died due to signal 11 (Invalid memory reference)
make[2]: *** [src/CMakeFiles/compadre.dir/Compadre_GMLS.cpp.o] Error 11
make[1]: *** [src/CMakeFiles/compadre.dir/all] Error 2
make: *** [all] Error 2

I was wondering if you would have any ideas on what could be causing these issues?

Hi @Yiltan, this may be a compiler/driver/architecture issue for CUDA (see https://forums.developer.nvidia.com/t/nvcc-error-ptxas-died-due-to-signal-11-invalid-memory-reference/32337).

Can you confirm that Kokkos_ARCH_VOLTA70 is the appropriate architecture for your device and that you are able to compile other codes with this configuration?

Hi @kuberry, I did see that forum post so I thought there may be an issue with my configuration.

With this configuration I am able to compile and run OpenMPI + UCX, and various other applications. This is the configuration used by our cluster.

Below is the architecture. I think I selected the correct flag for the build.

yiltan@power9pc:/scratch/yiltan/compadre/build$ nvidia-smi 
Wed Mar  3 16:56:57 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000004:04:00.0 Off |                    0 |
| N/A   29C    P0    51W / 300W |   1024MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000004:05:00.0 Off |                    0 |
| N/A   28C    P0    38W / 300W |     11MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000035:03:00.0 Off |                    0 |
| N/A   26C    P0    38W / 300W |     11MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000035:04:00.0 Off |                    0 |
| N/A   29C    P0    39W / 300W |     11MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
yiltan@power9pc:/scratch/yiltan/compadre/build$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Thu_Oct_24_17:58:26_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
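
As a side note on the architecture question, the mapping from a GPU's compute capability to a Kokkos architecture option can be written down as a small lookup. The sketch below is only an illustration: `kokkos_arch_for_cc` is a hypothetical helper (not part of Compadre or Kokkos), the option names are the ones used by Kokkos 3.x and may differ in other releases, and the compute capability is passed in by hand. A Tesla V100 has compute capability 7.0.

```shell
#!/bin/bash
# Hypothetical helper (not part of Compadre or Kokkos): map a CUDA compute
# capability to the matching Kokkos architecture option.
# Option names are assumed from Kokkos 3.x; verify against your Kokkos version.
kokkos_arch_for_cc() {
    case "$1" in
        3.5) echo "Kokkos_ARCH_KEPLER35" ;;
        6.0) echo "Kokkos_ARCH_PASCAL60" ;;
        6.1) echo "Kokkos_ARCH_PASCAL61" ;;
        7.0) echo "Kokkos_ARCH_VOLTA70" ;;
        7.5) echo "Kokkos_ARCH_TURING75" ;;
        8.0) echo "Kokkos_ARCH_AMPERE80" ;;
        *)   echo "unknown compute capability: $1" >&2; return 1 ;;
    esac
}

# Tesla V100 (the GPU shown in the nvidia-smi output) is compute capability 7.0.
kokkos_arch_for_cc 7.0
```

For the V100s listed above this yields the `Kokkos_ARCH_VOLTA70` option already used in the configure script, which suggests the GPU architecture flag is not the source of the crash.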

I have a very similar setup: Tesla V100s, same RAM, different wattage, same driver version and CUDA version. Are you sure that you should also be building for POWER9?

I think I selected the correct architecture. I also tried building it without the POWER9 flag and received the same issue.

yiltan@power9pc:/scratch/yiltan/compadre/build$ cat /proc/cpuinfo 
processor	: 0
cpu		: POWER9, altivec supported
clock		: 3800.000000MHz
revision	: 2.3 (pvr 004e 1203)
.
.
<removed to make output smaller>
.
.
processor	: 127
cpu		: POWER9, altivec supported
clock		: 3800.000000MHz
revision	: 2.3 (pvr 004e 1203)

timebase	: 512000000
platform	: PowerNV
model		: 8335-GTH
machine		: PowerNV 8335-GTH
firmware	: OPAL
MMU		: Radix

I think that there is an issue with that CUDA version on this system. I downgraded to CUDA 10.1.243 and was able to compile it.
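
For anyone hitting the same crash, a quick pre-build check of the toolkit version could look like the sketch below. Both function names are hypothetical helpers written for this illustration; the only facts they encode are the ones from this thread (CUDA 10.2.89 crashed in `ptxas` on this POWER9 system, 10.1.243 built). Other versions and platforms are not covered.

```shell
#!/bin/bash
# Hypothetical helper: extract the full toolkit version (e.g. "10.2.89")
# from `nvcc --version` output supplied on stdin.
nvcc_full_version() {
    sed -n 's/^Cuda compilation tools, release [0-9.]*, V\([0-9.]*\).*$/\1/p'
}

# Hypothetical helper: flag the one version reported as failing in this thread.
# Returns non-zero for 10.2.89 and zero for anything else.
check_cuda_version() {
    local version="$1"
    if [ "$version" = "10.2.89" ]; then
        echo "CUDA $version: ptxas segfault reported in this thread; 10.1.243 built cleanly"
        return 1
    fi
    echo "CUDA $version: no problem reported in this thread"
}

# Usage when nvcc is on PATH:
#   check_cuda_version "$(nvcc --version | nvcc_full_version)"
```

Running the usage line before invoking the configure script would stop early on the affected toolkit instead of failing partway through `make`.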

Thank you for the help

Sorry to hear there is an issue with that CUDA version, but I'm glad you found a solution. Thank you for the update!