m4rs-mt / ILGPU

ILGPU JIT Compiler for high-performance .NET GPU programs

Home Page: http://www.ilgpu.net

Can CuFFT or CuFFTW be used internally in the kernel?

xxl-cc opened this issue · comments

commented

I would like to replace the Fourier transform I wrote inside my kernel with CuFFT or CuFFTW to compare execution efficiency. Can they be called inside a kernel? I need to process a large amount of data inside the kernel and apply Fourier-transform filtering to it. However, the provided example only calls CuFFT directly from host code.
My example:

public static void FilterProjection(Index1D index, SpecializedValue filterFactorCount, Int2 projectionDim, ArrayView fftWn, ArrayView ifftWn, ArrayView filterFactor, ArrayView oneproj, ArrayView filteredproj)
{
    int i = index.X;
    var xdata = new Complex[filterFactorCount];
    for (int j = 0; j < filterFactorCount; j++)
    {
        if (j < projectionDim.Y)
        {
            xdata[j] = new Complex(oneproj[j * projectionDim.X + i], 0.0f);
        }
        else
        {
            xdata[j] = new Complex(0.0f, 0.0f);
        }
    }
    FTransfrom.FFT(xdata, fftWn, filterFactorCount); // Can CuFFT be used instead

    for (int j = 0; j < filterFactorCount; j++)
    {
        xdata[j] *= filterFactor[j];
    }
    FTransfrom.IFFT(xdata, ifftWn, filterFactorCount); // Can CuFFT be used instead

    for (int j = 0; j < projectionDim.Y; j++)
    {
        filteredproj[j * projectionDim.X + i] = (float)xdata[j].Real;
    }
}
commented

Hi @xxl-cc, unfortunately no, it is not possible to call the CuFFT or CuFFTW functions inside a GPU kernel. They are host-side APIs: each call itself launches one (or more) GPU kernels to perform the calculations, so it has to be issued from CPU code rather than from within a running kernel.
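One way to act on this answer is to split the single kernel into stages and run the transform from host code between the launches. The following is only a minimal sketch against the ILGPU 1.x API, under several assumptions: the filter factors are taken to be real-valued floats, `ComplexF` is a hypothetical struct defined just for this example, and the kernel names `PadKernel`, `MultiplyKernel` and `StoreKernel` are illustrative and do not come from ILGPU or from the original post. The host-side FFT calls are deliberately left as comments, because the exact binding signatures are not given in this thread.

```csharp
using System;
using ILGPU;
using ILGPU.Runtime;

// Hypothetical complex type for this sketch (not an ILGPU type).
public struct ComplexF
{
    public float Re;
    public float Im;
    public ComplexF(float re, float im) { Re = re; Im = im; }
}

public static class FilterPipelineSketch
{
    // Stage 1: one thread per (column, sample). Copies each projection column
    // into a contiguous, zero-padded run of fftLength elements.
    static void PadKernel(Index1D index, ArrayView<float> proj,
        ArrayView<ComplexF> padded, int width, int height, int fftLength)
    {
        int column = index.X / fftLength;
        int j = index.X % fftLength;
        float re = j < height ? proj[j * width + column] : 0.0f;
        padded[index] = new ComplexF(re, 0.0f);
    }

    // Stage 2: scale every frequency bin by its (assumed real) filter factor.
    static void MultiplyKernel(Index1D index, ArrayView<ComplexF> spectrum,
        ArrayView<float> filterFactor, int fftLength)
    {
        float f = filterFactor[index.X % fftLength];
        ComplexF v = spectrum[index];
        spectrum[index] = new ComplexF(v.Re * f, v.Im * f);
    }

    // Stage 3: write the real parts back in the original projection layout.
    static void StoreKernel(Index1D index, ArrayView<ComplexF> signal,
        ArrayView<float> filtered, int width, int height, int fftLength)
    {
        int column = index.X / fftLength;
        int j = index.X % fftLength;
        if (j < height)
            filtered[j * width + column] = signal[index].Re;
    }

    public static void Main()
    {
        const int width = 4, height = 6, fftLength = 8;
        var input = new float[width * height];
        var factors = new float[fftLength];
        for (int k = 0; k < input.Length; k++) input[k] = 1.0f;
        for (int k = 0; k < factors.Length; k++) factors[k] = 0.5f;

        using var context = Context.CreateDefault();
        using var accelerator = context.GetPreferredDevice(preferCPU: true)
            .CreateAccelerator(context);

        var pad = accelerator.LoadAutoGroupedStreamKernel<Index1D, ArrayView<float>,
            ArrayView<ComplexF>, int, int, int>(PadKernel);
        var multiply = accelerator.LoadAutoGroupedStreamKernel<Index1D,
            ArrayView<ComplexF>, ArrayView<float>, int>(MultiplyKernel);
        var store = accelerator.LoadAutoGroupedStreamKernel<Index1D,
            ArrayView<ComplexF>, ArrayView<float>, int, int, int>(StoreKernel);

        using var proj = accelerator.Allocate1D(input);
        using var filter = accelerator.Allocate1D(factors);
        using var work = accelerator.Allocate1D<ComplexF>(width * fftLength);
        using var filtered = accelerator.Allocate1D<float>(width * height);

        int total = width * fftLength;
        pad(total, proj.View, work.View, width, height, fftLength);
        // Host side: a batched forward FFT over `work` would be issued here
        // (width transforms of length fftLength), for example through CuFFT.
        multiply(total, work.View, filter.View, fftLength);
        // Host side: the matching batched inverse FFT would be issued here.
        store(total, work.View, filtered.View, width, height, fftLength);
        accelerator.Synchronize();

        float[] result = filtered.GetAsArray1D();
        Console.WriteLine($"first filtered sample: {result[0]}");
    }
}
```

The data stays in device buffers across all stages, so the only per-stage cost is the kernel launch itself; no copy back to the CPU is needed between the transform and the filtering step.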

commented

I was not previously aware of this, but it looks like there is a CuFFTDx library that can be called inside a GPU kernel.
https://docs.nvidia.com/cuda/cufftdx/index.html

It is a C++ header-only library, so you would need to port it to C# to use it from ILGPU.

commented

Thank you for your reply. Are all CUDA libraries wrapped by ILGPU currently unusable inside a kernel? Frequent data exchange between the kernel and the host can significantly reduce performance. We hope that the ILGPU.Algorithms library can become more and more comprehensive, reducing repetitive coding work.

commented

From ILGPU.Algorithms, the Grid, Group, Vector and Warp extensions can be used within a kernel. All the "bigger" functions like RadixSort are not usable inside a GPU kernel.
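As an illustration of an extension that can run inside a kernel, here is a minimal sketch that calls `GroupExtensions.AllReduce` from an explicitly grouped kernel. It assumes the ILGPU 1.x API with the ILGPU.Algorithms package referenced; the kernel name `GroupSumKernel` and the group-size cap of 64 are arbitrary choices for this example.

```csharp
using System;
using ILGPU;
using ILGPU.Algorithms;
using ILGPU.Algorithms.ScanReduceOperations;
using ILGPU.Runtime;

public static class GroupExtensionsSketch
{
    // Explicitly grouped kernel: there is no index parameter, the thread
    // position is read from Grid/Group instead.
    static void GroupSumKernel(ArrayView<int> output)
    {
        int globalIndex = Grid.GlobalIndex.X;
        // Every thread contributes 1; AllReduce makes the group-wide sum
        // available to each thread of the group.
        output[globalIndex] = GroupExtensions.AllReduce<int, AddInt32>(1);
    }

    public static void Main()
    {
        // The algorithms library has to be enabled on the context.
        using var context = Context.Create(
            builder => builder.Default().EnableAlgorithms());
        using var accelerator = context.GetPreferredDevice(preferCPU: true)
            .CreateAccelerator(context);

        // Arbitrary cap for the sketch; must not exceed the device limit.
        int groupSize = Math.Min(accelerator.MaxNumThreadsPerGroup, 64);

        var kernel = accelerator.LoadStreamKernel<ArrayView<int>>(GroupSumKernel);
        using var buffer = accelerator.Allocate1D<int>(groupSize);
        buffer.MemSetToZero();

        // One group of groupSize threads.
        kernel(new KernelConfig(1, groupSize), buffer.View);
        accelerator.Synchronize();

        int[] result = buffer.GetAsArray1D();
        Console.WriteLine($"group size = {groupSize}, first element = {result[0]}");
    }
}
```

Since each thread adds 1, every element of the result should equal the group size, which gives a quick way to check that the extension ran inside the kernel.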

The Cuda library bindings, including CuRand, CuBlas, CuFFT etc., are not usable within a GPU kernel either. This is the same in C++.

There is also the LibDevice library that can be used inside a GPU kernel, however, it currently does not support non-Cuda accelerators, including the CPU accelerator used for debugging.

commented

Hi @xxl-cc. I'm closing this ticket for now. If there is anything else to add, please let us know.