Add alloc_async function that takes a single element to copy into all the elements
coreylowman opened this issue
This actually seems a little more subtle than I had hoped: it seems like the driver API only supports memset for values up to 32 bits wide (cuMemsetD8/D16/D32)? A cleaner approach might be to fill a host allocation with the value and then call take_async on it.
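A minimal sketch of the host-side half of that idea, using only the standard library. The helper name `filled_host_buffer` is hypothetical and not part of any crate; the upload step (handing the resulting `Vec` to the device with take_async) is assumed and deliberately left out, since its exact signature is not given in this thread.

```rust
/// Build a host-side buffer holding `len` copies of `value`.
///
/// Hypothetical helper: the returned Vec is what would then be handed
/// to the device upload call (take_async in this discussion).
fn filled_host_buffer<T: Clone>(value: T, len: usize) -> Vec<T> {
    // `vec!` clones `value` into every slot, so this works for any
    // element width, not just the 8/16/32-bit cases memset can handle.
    vec![value; len]
}
```

The trade-off is an extra host allocation and a full host-to-device copy for every fill, which is what makes a device-side memset attractive when the element type allows it.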
yes, memset only works for primitives, iirc even for 64 bit values (but probably still kinda useless)
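To make the width limit concrete: the driver API offers memset with 8-, 16- and 32-bit patterns (cuMemsetD8, cuMemsetD16, cuMemsetD32), so a wider element can only be filled that way if its bytes happen to repeat with one of those periods. The sketch below checks this on the host; `memset_width` is a hypothetical helper written for illustration, not an existing API.

```rust
/// Return the smallest memset pattern width in bytes (1, 2 or 4) that
/// reproduces `elem` when tiled, or None if no such width exists.
///
/// Hypothetical helper: `elem` is the raw byte representation of one
/// element, e.g. from `to_le_bytes()`.
fn memset_width(elem: &[u8]) -> Option<usize> {
    [1usize, 2, 4].iter().copied().find(|&w| {
        // The element must split evenly into chunks of width `w`,
        // and every chunk must equal the first one.
        !elem.is_empty()
            && elem.len() % w == 0
            && elem.chunks(w).all(|c| c == &elem[..w])
    })
}
```

For example, `0.0f64` is all zero bytes and can be filled with an 8-bit pattern, while `1.0f64` has no repeating 32-bit pattern and cannot be filled with a memset at all.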
Could there be an unsafe alloc function that just allocates without zeroing the memory?
Maybe just take the naive approach and loop through the whole region, copying the value into each element (with async copies this shouldn't be too bad).
Not going to do this. If someone really needs it, they can write their own kernel that sets the values however they want (I did this in a couple of places in dfdx).