coreylowman / cudarc

Safe rust wrapper around CUDA toolkit

What is the correct way to do device memory manipulation using this library?

l3utterfly opened this issue · comments

I am trying to accomplish the following: append two arrays on device memory.

Normally, the following steps achieve this:

  1. Allocate a new block of memory with size = array1 + array2
  2. Copy all memory from array1 -> newArray
  3. Copy all memory from array2 -> newArray starting at dst memory location *newArray + sizeof(array1[0]) * len(array1)
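The three steps above can be sketched on the host side as follows. This is only an illustration of the offset arithmetic, not cudarc code: a plain Vec stands in for a device allocation, copy_from_slice stands in for a device-to-device copy, and the name append_buffers is hypothetical. Slice ranges are in elements, so the byte offset sizeof(array1[0]) * len(array1) from step 3 becomes the element offset a.len().

```rust
/// Host-side stand-in for the device-side append (hypothetical helper):
/// `Vec<T>` plays the role of a device allocation and `copy_from_slice`
/// plays the role of a device-to-device copy.
fn append_buffers<T: Copy + Default>(a: &[T], b: &[T]) -> Vec<T> {
    // Step 1: allocate the destination with room for both inputs.
    let mut out = vec![T::default(); a.len() + b.len()];
    // Step 2: copy the first array to the start of the destination.
    out[..a.len()].copy_from_slice(a);
    // Step 3: copy the second array starting at element offset a.len().
    out[a.len()..].copy_from_slice(b);
    out
}
```

Each copy writes into a disjoint sub-range of the destination, so neither input buffer needs to be leaked or turned into a raw pointer.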

I'm trying to achieve the same with this library. I've gotten this far:

  1. let raw_dev_ptr1 = cuda_slice1.leak()
  2. let raw_dev_ptr2 = cuda_slice2.leak()
  3. let new_cuda_slice = dev.alloc(total_size)

This allows me to obtain the raw pointer locations to my data on device.

I'm looking into the function cudarc::driver::result::memcpy_dtod_async, which seems to be what I need, but I'm having trouble getting a stream to pass to it.

CudaDevice has a stream field that looks like the right one, but it is private.

What is the correct way to access the stream using this library?

You can use CudaDevice::dtod_copy for this, combined with CudaSlice::slice_mut:

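// alloc_zeros and dtod_copy return Result, so propagate errors with `?`.
let a = dev.alloc_zeros::<f32>(5)?;
let b = dev.alloc_zeros::<f32>(5)?;
let mut c = dev.alloc_zeros::<f32>(10)?;

// Copy a into the first half of c and b into the second half.
dev.dtod_copy(&a, &mut c.slice_mut(0..5))?;
dev.dtod_copy(&b, &mut c.slice_mut(5..10))?;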

As for accessing the stream, this isn't possible right now, but I'm definitely open to making the field public. I need to think more about mixing the safe and result APIs. In general, every method in result has a safe counterpart, so you should be able to use the safe API for everything.

I'll close this for now, but if you want to add this to examples/02-copy.rs, that would probably be useful in the future!