ComputationalRadiationPhysics / xeus-cling-cuda-container

This repository contains container recipes to build the entire Xeus-Cling and Cling stack, including the CUDA extension, with just a few commands.

CUDA Kernel fails at sqrt() or rsqrtf() operations

JerryI opened this issue

I found a strange bug. Consider the simplest possible kernel:

__global__ void makezeros(
    float4* spin
) {
    int blockId = blockIdx.x + blockIdx.y * gridDim.x + gridDim.x * gridDim.y * blockIdx.z;
    int gid = blockId * (blockDim.x * blockDim.y) + (threadIdx.y * blockDim.x) + threadIdx.x;
    
    spin[gid].x = 0.001;
}

It works perfectly. However, adding a single square root operation is enough to break it:

__global__ void makezeros(
    float4* spin
) {
    int blockId = blockIdx.x + blockIdx.y * gridDim.x + gridDim.x * gridDim.y * blockIdx.z;
    int gid = blockId * (blockDim.x * blockDim.y) + (threadIdx.y * blockDim.x) + threadIdx.x;
    
    float empty = sqrt(3.3f);
    spin[gid].x = 0.001;
}

With this change, the array spin is no longer modified, yet no error is shown.
It looks as if the kernel was never launched at all.

I found an error code. It is:

PTX JIT compilation failed
Error code 78

It occurs every time the kernel calls sqrt() or a similar function such as rsqrtf().
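
For anyone trying to reproduce this, here is a minimal sketch of how a silent failure like this can be turned into a visible error message. It assumes the standard CUDA runtime API; the helper checkCuda and the grid and block sizes are hypothetical and not part of the original report. Whether the failure surfaces at the launch check or at the synchronization may depend on the toolchain.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel from the report, including the sqrt() call that triggers the failure.
__global__ void makezeros(float4* spin) {
    int blockId = blockIdx.x + blockIdx.y * gridDim.x + gridDim.x * gridDim.y * blockIdx.z;
    int gid = blockId * (blockDim.x * blockDim.y) + (threadIdx.y * blockDim.x) + threadIdx.x;

    float empty = sqrt(3.3f);
    (void)empty;  // silence the unused-variable warning
    spin[gid].x = 0.001f;
}

// Hypothetical helper: print the error string and numeric code, return true on success.
static bool checkCuda(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s (error code %d)\n",
                     what, cudaGetErrorString(err), static_cast<int>(err));
        return false;
    }
    return true;
}

int main() {
    // Assumed launch configuration: 2x2 blocks of 8x8 threads, 256 elements in total.
    const dim3 grid(2, 2, 1);
    const dim3 block(8, 8, 1);
    const size_t n = static_cast<size_t>(grid.x) * grid.y * grid.z * block.x * block.y;

    float4* spin = nullptr;
    if (!checkCuda(cudaMalloc(&spin, n * sizeof(float4)), "cudaMalloc")) return EXIT_FAILURE;
    if (!checkCuda(cudaMemset(spin, 0, n * sizeof(float4)), "cudaMemset")) {
        cudaFree(spin);
        return EXIT_FAILURE;
    }

    makezeros<<<grid, block>>>(spin);

    // A kernel launch returns nothing, so errors have to be queried explicitly:
    // cudaGetLastError() reports launch-time problems,
    // cudaDeviceSynchronize() reports problems that occur while the kernel runs.
    bool ok = checkCuda(cudaGetLastError(), "kernel launch");
    ok = checkCuda(cudaDeviceSynchronize(), "kernel execution") && ok;

    if (ok) {
        float4 first;
        ok = checkCuda(cudaMemcpy(&first, spin, sizeof(float4), cudaMemcpyDeviceToHost),
                       "cudaMemcpy");
        if (ok) std::printf("spin[0].x = %f\n", first.x);
    }

    cudaFree(spin);
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Without the two checks after the launch, the program would simply continue and spin would stay unchanged, which matches the behavior described above.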

I have to verify it with the latest cling version. The container is highly outdated.

My assumption: The path to the CUDA std library is not set for the device compiler.