Is it possible to allocate memory on the GPU for a single value and reclaim it after a kernel call?
l3utterfly opened this issue · comments
For example, I have a simple CUDA kernel which counts non-zero elements. It will not be run in parallel, only on one thread:
```cpp
__global__ void countNonZeroElements(const float *input, int input_length, int *non_zero_count) {
    int count = 0;
    for (int i = 0; i < input_length; i++) {
        if (input[i] != 0) {
            count++;
        }
    }
    *non_zero_count = count;
}
```
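For reference, the kernel's single-thread logic is equivalent to this host-side Rust function (a small sketch, useful for validating the value copied back from the GPU):

```rust
/// Host-side reference implementation of the kernel's logic:
/// counts elements of `input` that are not exactly zero.
fn count_non_zero(input: &[f32]) -> i32 {
    input.iter().filter(|&&x| x != 0.0).count() as i32
}

fn main() {
    let data = [0.0, 1.5, 0.0, -2.0, 3.0];
    println!("non-zero elements: {}", count_non_zero(&data)); // prints 3
}
```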
Then I want to read the `non_zero_count` variable after calling the kernel function. I only see ways to allocate and reclaim a `CudaSlice`.
You could just allocate a cuda slice with one value.
let mut non_zero_count = dev.alloc_zeros::<i32>(1)?;
There's no way to pass pointers to primitive types to cuda kernels atm, for a few reasons:
- It'd have to be in some shared host/gpu memory.
- It'd be unsound, because you could mutate the primitive on the Rust side while the kernel is running. This isn't possible with `CudaSlice` (unless you're using `CudaStream` improperly), because all the kernels are executed sequentially on a single stream.
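To round out the suggestion, here is a sketch of the full round trip with a one-element slice, assuming cudarc's driver API (names such as `compile_ptx`, `load_ptx`, `launch`, and `dtoh_sync_copy` come from recent cudarc versions and may differ in yours; it needs a CUDA-capable GPU to actually run):

```rust
use cudarc::driver::{CudaDevice, LaunchAsync, LaunchConfig};
use cudarc::nvrtc::compile_ptx;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    const KERNEL: &str = r#"
    extern "C" __global__ void countNonZeroElements(const float *input, int input_length, int *non_zero_count) {
        int count = 0;
        for (int i = 0; i < input_length; i++) {
            if (input[i] != 0) { count++; }
        }
        *non_zero_count = count;
    }"#;

    let dev = CudaDevice::new(0)?;
    dev.load_ptx(compile_ptx(KERNEL)?, "module", &["countNonZeroElements"])?;
    let f = dev.get_func("module", "countNonZeroElements").unwrap();

    let host_input = vec![0.0f32, 1.5, 0.0, -2.0, 3.0];
    let input = dev.htod_copy(host_input.clone())?;
    // A one-element slice stands in for the single output value.
    let mut non_zero_count = dev.alloc_zeros::<i32>(1)?;

    // Launch with a single thread, matching the kernel's sequential loop.
    let cfg = LaunchConfig::for_num_elems(1);
    unsafe { f.launch(cfg, (&input, host_input.len() as i32, &mut non_zero_count)) }?;

    // Copy the one value back to the host; this synchronizes the stream,
    // so the kernel is guaranteed to have finished before we read it.
    let counts: Vec<i32> = dev.dtoh_sync_copy(&non_zero_count)?;
    println!("non-zero elements: {}", counts[0]);
    Ok(())
}
```

Dropping `non_zero_count` at the end of scope reclaims the device allocation.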
Ok, got it. Thank you.
Sure thing, closing for now 👍