fjarri / reikna

Pure Python GPGPU library

Home Page:http://reikna.publicfields.net/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

current example of R2R FFT?

tlambert03 opened this issue · comments

From what I understand, FFT only does C2C... which is fine. I'm looking to create a function that mimics the api of scipy.fft.rfft in terms of inputs and outputs (even if the actual FFT operates on complex). I found this example transformation from 2013: #9 (comment) ... and can work off that, but I'm wondering if you have any more recent tips or examples on how you'd accomplish that?

thanks!

edit: I suspect I can use split_complex transformation ... but still learning the patterns.

This example should work: https://github.com/fjarri/reikna/blob/develop/examples/demo_real_to_complex_fft.py . It only describes how to make real-to-complex FFT, but making real-to-real by analogy should be relatively straightforward (if it isn't, do comment here - I know Reikna's API may be unfriendly sometimes).

As for split_complex, it does not discard the complex part; you would want to use it if you wanted to switch from interleaved representation to keeping real and complex parts in separate arrays. I think the cast transformation should work here, both ways (real to complex and complex to real)

Thanks for the prompt reply @fjarri! much appreciated...

I'm still struggling to connect the parameters correctly (sorry :/). here's what I've tried:

this works great for R2C, as shown in the example:

def tform_r2c(arr):
    complex_dtype = cluda.dtypes.complex_for(arr.dtype)
    return Transformation(
        [
            Parameter("output", Annotation(Type(complex_dtype, arr.shape), "o")),
            Parameter("input", Annotation(arr, "i")),
        ],
        """
        ${output.store_same}(
            COMPLEX_CTR(${output.ctype})(
                ${input.load_same},
                0));
        """,
    )


r2c = tform_r2c(arr)  # arr is float32
plan = FFT(r2c.output, axes=axes)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
# good to go

btw, if I try to use cast instead of tform_r2c above, I get a stride mismatch (I suspect i haven't done it correctly):

r2c = cast(arr, cluda.dtypes.complex_for(arr.dtype))
plan = FFT(r2c.output, axes=axes)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation.py:477: in <listcomp>
    self._get_kernel_argobject(name, self.root_parameters[name].annotation)
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation.py:463: in _get_kernel_argobject
    store_idx, store_same, store_combined_idx = self._get_connection_modules(
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation.py:440: in _get_connection_modules
    m_idx = module_leaf_macro(output, param)
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation_modules.py:162: in module_leaf_macro
    index_expr=flat_index_expr(param)))
E           ValueError: Some of the strides (4096, 4)are not multiples of the itemsize8

On the output, I tried this C2R transform:

def tform_c2r(arr):
    """Transform a complex array to a real one by discarding the imaginary part."""
    real_dtype = cluda.dtypes.real_for(arr.dtype)
    return Transformation(
        [
            Parameter("output", Annotation(Type(real_dtype, arr.shape), "o")),
            Parameter("input", Annotation(arr, "i")),
        ],
        """
        ${output.store_same}(${input.load_same}.x);
        """,
    )

and I suspect I'm not connecting it right:

r2c = tform_r2c(arr)
plan = FFT(r2c.output, axes=axes)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
c2r = tform_c2r(plan.parameter.output)
plan.parameter.output.connect(c2r, c2r.output, new_output=c2r.input)
ValueError: Incompatible types of the transformation parameter 'output' (Type(float32, shape=(128, 128))) and the node 'output' (Type(complex64, shape=(128, 128)))

(intuitively it feels like I've got the output connection backwards... but I can't figure out the syntax to fix it)

Thanks for trying out cast(), I think you discovered a bug there. Or, rather, there's a bug in Type.with_dtype(), which does not change the strides when the itemsize changes. By the way, sometimes during a debug it may be convenient to print out the .signature property of a transformation or a computation, to see what they actually expect as arguments.

You were on the right track with the manual transformation, except you got the connection bit wrong. It should be:

plan.parameter.output.connect(c2r, c2r.input, new_output=c2r.output)

That is, you're connecting c2r.input to plan.output.

thanks a lot! got it working... full R2R example in details below for anyone who stumbles on this:

from reikna.transformations import Annotation, Parameter, Transformation
import reikna.cluda as cluda
from reikna.core import Type
from reikna.fft import FFT
import numpy as np

api = cluda.ocl_api()
thr = api.Thread.create()


def tform_r2c(arr):
    complex_dtype = cluda.dtypes.complex_for(arr.dtype)
    return Transformation(
        [
            Parameter("output", Annotation(Type(complex_dtype, arr.shape), "o")),
            Parameter("input", Annotation(arr, "i")),
        ],
        """
        ${output.store_same}(
            COMPLEX_CTR(${output.ctype})(
                ${input.load_same},
                0));
        """,
    )


def tform_c2r(arr):
    """Transform a complex array to a real one by discarding the imaginary part."""
    real_dtype = cluda.dtypes.real_for(arr.dtype)
    return Transformation(
        [
            Parameter("output", Annotation(Type(real_dtype, arr.shape), "o")),
            Parameter("input", Annotation(arr, "i")),
        ],
        """
        ${output.store_same}(${input.load_same}.x);
        """,
    )

arr = np.random.rand(8).astype(np.float32)

r2c = tform_r2c(arr)
plan = FFT(r2c.output)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
c2r = tform_c2r(plan.parameter.output)
plan.parameter.output.connect(c2r, c2r.input, new_output=c2r.output)
planc = plan.compile(thread=thr)

arr_dev = thr.to_device(arr)
out_dev = thr.array(arr.shape, np.float32)
planc(out_dev, arr_dev)

assert np.allclose(out_dev.get(), np.fft.fft(arr).real)

<\details>

I'll close this issue ... but if I may ask you one more question:
as far as you know, are there any other python opencl FFT implementations out there? I have everything i need in reikna, so I'm not looking for an alternative... just want to confirm my perception that reikna's OCL implementation is still the "go-to" source for a GPU FFT without using CUDA)

The only one I know of is https://github.com/geggo/gpyfft, which is based on https://github.com/clMathLibraries/clFFT. I have not used it myself though, so I don't know how the performance compares (but I wouldn't be surprised if it's better).

thanks again for your help