Can CuFFT or CuFFTW be used internally in the kernel?

Question

xxl-cc opened this issue a year ago

I would like to replace the Fourier transform I wrote inside my kernel with CuFFT or CuFFTW to compare the execution efficiency. Can they be called from inside a kernel? I need to process a large amount of data inside the kernel and apply Fourier-transform filtering to it, but the provided example only shows CuFFT being used directly, outside a kernel.
My example:

public static void FilterProjection(Index1D index, SpecializedValue filterFactorCount, Int2 projectionDim, ArrayView fftWn, ArrayView ifftWn, ArrayView filterFactor, ArrayView oneproj, ArrayView filteredproj)
{
    int i = index.X;
    var xdata = new Complex[filterFactorCount];
    for (int j = 0; j < filterFactorCount; j++)
    {
        if (j < projectionDim.Y)
        {
            xdata[j] = new Complex(oneproj[j * projectionDim.X + i], 0.0f);
        }
        else
        {
            xdata[j] = new Complex(0.0f, 0.0f);
        }
    }
    FTransfrom.FFT(xdata, fftWn, filterFactorCount); // Can CuFFT be used instead?

    for (int j = 0; j < filterFactorCount; j++)
    {
        xdata[j] *= filterFactor[j];
    }
    FTransfrom.IFFT(xdata, ifftWn, filterFactorCount); // Can CuFFT be used instead?

    for (int j = 0; j < projectionDim.Y; j++)
    {
        filteredproj[j * projectionDim.X + i] = (float)xdata[j].Real;
    }
}

MoFtZ · Answer 1 · Thu Apr 27 2023 19:50:23 GMT+0800 (China Standard Time)

Hi @xxl-cc, unfortunately no, it is not possible to call the CuFFT or CuFFTW functions inside a GPU kernel. These functions are host-side calls that themselves launch one (or more) GPU kernels to perform the calculations.
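
One possible way to work around this is to split the kernel from the question into three small kernels and run the two transforms on the host between them, as batched FFTs over all columns at once. The following is only a rough sketch under several assumptions: `Complex32`, `FilterKernels`, `Pad`, `ApplyFilter`, `Unpack` and the `batch` buffer are hypothetical names that do not come from ILGPU or from the question, `width` and `height` stand for `projectionDim.X` and `projectionDim.Y`, `n` stands for `filterFactorCount`, and `filterFactor` is assumed to hold real `float` values. The batched FFT calls themselves are left as comments, because their exact form depends on the CuFFT binding you use.

```csharp
using ILGPU;

// Hypothetical single-precision complex value (real part, imaginary part).
// The interleaved layout is an assumption, not a type provided by ILGPU.
public struct Complex32
{
    public float Re;
    public float Im;

    public Complex32(float re, float im)
    {
        Re = re;
        Im = im;
    }
}

public static class FilterKernels
{
    // Step 1: copy column i of the projection into signal i of the batch,
    // zero-padding it from `height` to `n` samples.
    // Intended to be launched with width * n threads.
    public static void Pad(
        Index1D index, int width, int height, int n,
        ArrayView<float> oneproj, ArrayView<Complex32> batch)
    {
        int idx = index.X;
        int i = idx / n; // column of the projection = signal in the batch
        int j = idx % n; // sample within that signal
        float re = j < height ? oneproj[j * width + i] : 0.0f;
        batch[idx] = new Complex32(re, 0.0f);
    }

    // Host side, between step 1 and step 2:
    // forward FFT over `batch` (`width` signals of length `n` each).

    // Step 2: multiply every frequency bin by its (assumed real) filter factor.
    // Intended to be launched with width * n threads.
    public static void ApplyFilter(
        Index1D index, int n,
        ArrayView<float> filterFactor, ArrayView<Complex32> batch)
    {
        int idx = index.X;
        float f = filterFactor[idx % n];
        Complex32 v = batch[idx];
        batch[idx] = new Complex32(v.Re * f, v.Im * f);
    }

    // Host side, between step 2 and step 3:
    // inverse FFT over `batch`.

    // Step 3: write the real part of the first `height` samples of each
    // signal back in the original layout.
    // Intended to be launched with width * height threads.
    public static void Unpack(
        Index1D index, int width, int n, float scale,
        ArrayView<Complex32> batch, ArrayView<float> filteredproj)
    {
        int idx = index.X;
        int j = idx / width; // row of the projection
        int i = idx % width; // column of the projection
        filteredproj[idx] = batch[i * n + j].Re * scale;
    }
}
```

In this layout each column occupies `n` contiguous elements of `batch`, which is the shape a batched 1D transform usually expects. cuFFT transforms are unnormalized, so if the original `FTransfrom.IFFT` divides by the length, `scale` would need to be `1.0f / n` to match its output. All intermediate data stays in GPU memory; only the kernel launches and FFT calls are issued from the host.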

MoFtZ · Answer 2 · Thu Apr 27 2023 19:58:38 GMT+0800 (China Standard Time)

I was not previously aware of this, but it looks like there is a CuFFTDx library that can be called inside a GPU kernel.
https://docs.nvidia.com/cuda/cufftdx/index.html

It is a C++ header-only library, so you would need to port it to C#.

xxl-cc · Answer 3 · Thu Apr 27 2023 20:06:57 GMT+0800 (China Standard Time)

Thank you for your reply. Are all of the CUDA libraries wrapped by ILGPU currently unusable inside a kernel? Frequent data exchange with code outside the kernel can significantly reduce performance. We hope that the ILGPU.Algorithms library will become more and more comprehensive, which would reduce some repetitive coding work.

MoFtZ · Answer 4 · Thu Apr 27 2023 20:19:14 GMT+0800 (China Standard Time)

From ILGPU.Algorithms, the Grid, Group, Vector and Warp extensions can be used within a kernel. All the "bigger" functions like RadixSort are not usable inside a GPU kernel.

The Cuda library bindings, including CuRand, CuBlas, CuFFT etc, are all not usable within a GPU kernel. This is the same in C++.

There is also the LibDevice library that can be used inside a GPU kernel, however, it currently does not support non-Cuda accelerators, including the CPU accelerator used for debugging.

MoFtZ · Answer 5 · Fri May 12 2023 04:47:41 GMT+0800 (China Standard Time)

Hi @xxl-cc. I'm closing this ticket for now. If there is anything else to add, please let us know.