'context is destroyed' exception after using CuFFT.

Question

ebprj21 opened this issue a year ago

I have been using your masterpiece to great effect.
If I had the skills, I would help with the project, but in reality all I can do is file issues like this one.

ILGPU's CuFFT example code is as follows:
(\Samples\AlgorithmsCuFFT\Program.cs)

static void DoForwardPlan(
    CudaAccelerator accelerator,
    CuFFT cufft,
    Complex[] input,
    out Complex[] output)
{
    using (var stream = accelerator.CreateStream() as CudaStream)
    {
        using (var inputBuffer = accelerator.Allocate1D(input))
        {
            using (var outputBuffer = accelerator.Allocate1D<Complex>(input.Length))
            {
                CuFFTException.ThrowIfFailed(
                    cufft.Plan1D(
                        out var plan,
                        input.Length,
                        CuFFTType.CUFFT_Z2Z,
                        batch: 1));
                using (plan)
                {
                    plan.SetStream(stream);
                    CuFFTException.ThrowIfFailed(
                        plan.ExecZ2Z(
                            inputBuffer.View.BaseView,
                            outputBuffer.View.BaseView,
                            CuFFTDirection.FORWARD));

                    output = outputBuffer.GetAsArray1D(stream);
                }
                WorkaroundKnownIssue(accelerator, cufft.API);

                Console.WriteLine("Output Values:");
                for (var i = 0; i < output.Length; i++)
                    Console.WriteLine($"  [{i}] = {output[i]}");
            }
        }
    }
}

In the code above, disposing 'plan' also appears to destroy the CUDA context.
As a result, a 'context is destroyed' exception is thrown when outputBuffer is disposed.

The issue looks essentially the same as this one:
inducer/pycuda#356

Please consider this issue seriously.

P.S. The new features in CUDA v12 look interesting, for example:
CUDA Context-Independent Module Loading

MoFtZ · Answer 1 · Thu Mar 16 2023 09:48:03 GMT+0800 (China Standard Time)

Hi @ebprj21. Thanks for reporting this issue.

This looks like an issue with the CUDA SDK. Notice the call to WorkaroundKnownIssue. The sample code limits the workaround to specific versions of the CUDA SDK. The workaround will restore the original CUDA context. If you force the workaround to always run, it should fix your issue.
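
To illustrate what "always run the workaround" could look like, here is a minimal sketch. The class and method names (CuFFTContextWorkaround, RestoreContext) are hypothetical, and it assumes that CudaAPI.CurrentAPI.SetCurrentContext and accelerator.NativePtr are available in your ILGPU version and behave as the sample's WorkaroundKnownIssue relies on. Please check it against the sample source before using it.

```csharp
using ILGPU.Runtime.Cuda;

static class CuFFTContextWorkaround
{
    // Hypothetical helper: makes the accelerator's CUDA context current
    // again on the calling thread, without checking the cuFFT version.
    // Assumption: SetCurrentContext and NativePtr exist with these
    // signatures in the ILGPU version in use.
    public static void RestoreContext(CudaAccelerator accelerator)
    {
        CudaException.ThrowIfFailed(
            CudaAPI.CurrentAPI.SetCurrentContext(accelerator.NativePtr));
    }
}
```

In DoForwardPlan, a call such as CuFFTContextWorkaround.RestoreContext(accelerator) would go where WorkaroundKnownIssue is called now: after the using (plan) block closes and before outputBuffer and inputBuffer are disposed, so that the buffers are released on the accelerator's own context.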

The CUDA Release Notes mention that in v11.2 and v11.3, there was a known issue:
cuFFT planning and plan estimation functions may not restore correct context affecting CUDA driver API applications.
https://docs.nvidia.com/cuda/archive/11.4.0/cuda-toolkit-release-notes/index.html

Which version of the CUDA SDK are you using? It looks like the bug is also present in other SDK versions.

ebprj21 · Answer 2 · Thu Mar 16 2023 13:03:37 GMT+0800 (China Standard Time)

Thank you, your help is greatly appreciated.

I tested it on two systems, one with CUDA v11.6 and one with v11.8.
These are different versions from the ones you mention, but both show the problem.

I guess I will have to write my code so that it always calls 'WorkaroundKnownIssue'.

Thank you very much.

P.S. I look forward to ILGPU being able to use LibDevice in a CUDA v12 environment.