Add support for device-side globals
eyalroz opened this issue · comments
CUDA supports device-side global variables (see here, for example). We should support running kernels which employ such variables. That means:
- Having the command-line and/or the kernel adapter specify what these are.
- Taking their values via the command line (buffers only, or buffers and scalars?)
- Registering them with the NVRTC program.
- Copying them "to symbol" before every run of the kernel; or copying them once and re-copying only when necessary, depending on whether the variable is read-only.
- Optionally copying the global variable values back to the host, to save as outputs (again - what about scalars?)
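
The copy-once-versus-copy-every-run question could be sketched roughly as follows. This is only an illustration of the policy, not a proposed implementation: `GlobalSpec`, `Uploader` and `upload_globals_before_run` are hypothetical names, and the actual host-to-device copy is left to a callback (with the CUDA driver API it could, for instance, look up the symbol with `cuModuleGetGlobal` and copy with `cuMemcpyHtoD`).

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of one device-side global, as it might be
// specified on the command line or by a kernel adapter.
struct GlobalSpec {
    std::string name;                          // symbol name in the kernel source
    std::vector<unsigned char> initial_value;  // bytes to place in the symbol
    bool read_only;                            // true if the kernel never writes to it
};

// Callback performing the actual host-to-device copy for one global.
// Assumption: the caller supplies this; nothing here talks to a device.
using Uploader = std::function<void(const GlobalSpec&)>;

// Uploads whichever globals need (re-)initialization before the run with
// the given 0-based index, and returns how many uploads were performed.
// Policy: everything is uploaded before the first run; afterwards only
// globals the kernel may have modified (those not marked read-only).
std::size_t upload_globals_before_run(
    const std::vector<GlobalSpec>& globals,
    std::size_t run_index,
    const Uploader& upload)
{
    const bool first_run = (run_index == 0);
    std::size_t num_uploaded = 0;
    for (const auto& global : globals) {
        if (first_run || !global.read_only) {
            upload(global);
            ++num_uploaded;
        }
    }
    return num_uploaded;
}
```

With one read-only and one writable global, this would upload both before the first run and only the writable one before each later run. Whether a global is read-only would have to come from the user, since it cannot be inferred reliably here.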
Also - we need to look into the equivalent of these in OpenCL.