NVIDIA / warp

A Python framework for high performance GPU simulation and graphics

Home Page: https://nvidia.github.io/warp/

Index to array can only be int

jc211 opened this issue · comments

This was mentioned on the Discord, but I've run into this enough times that I think it's worth tracking as an issue on GitHub.

The index to a Warp array can only be an int. I think this should be more flexible and allow unsigned integers of all widths.

Example:

import warp as wp

@wp.kernel
def test(data: wp.array(dtype=wp.float32)):
    idx = wp.uint32(2)
    data[idx] = wp.float32(2.0)       # does not work: uint32 index is rejected
    data[int(idx)] = wp.float32(0.0)  # does work: explicit cast to int

Implicit casting is not supported in Warp, but I think indexes should be an exception.

The same is true of range(): it would be good if for i in range(idx) worked when idx is a uint32.
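
For reference, a minimal sketch of the workaround that works today for the range() case (the kernel name and contents are made up for illustration):

import warp as wp

@wp.kernel
def fill(data: wp.array(dtype=wp.float32)):
    n = wp.uint32(4)
    # for i in range(n):  # desired, but fails today because n is a uint32
    for i in range(int(n)):  # workaround: cast the loop bound to int explicitly
        data[i] = wp.float32(1.0)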

Mostly a note to myself: this is going to require careful investigation. While two's complement arithmetic works exactly the same for signed and unsigned values, index values in the range 0x80000000 to 0xFFFFFFFF access very different memory locations depending on whether they are interpreted as signed or unsigned, due to sign- or zero-extension when performing 64-bit pointer arithmetic. Translation through CUDA requires several layers of the compilation stack to be aware of the semantic differences and the conventions that are adhered to (or not). I wouldn't be surprised if we encountered a bug along the way that is not easy to work around.
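
As a plain-Python illustration of that point (not Warp code), the same 32-bit pattern yields very different 64-bit offsets depending on whether it is sign- or zero-extended:

bits = 0xFFFFFFFF  # a 32-bit index with the high bit set

# Zero-extension (unsigned interpretation): a huge positive offset.
zero_extended = bits                                             # 4294967295

# Sign-extension (signed interpretation): a negative offset.
sign_extended = bits - (1 << 32) if bits & 0x80000000 else bits  # -1

print(zero_extended, sign_extended)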

We need to look into support for 64-bit indexing at the same time. While this is theoretically simpler, we already rely on the 32-bit to 64-bit conversion happening implicitly. It seems better to make it explicit, but we should also check what impact it may have on performance and security, if any.
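
A rough sketch, in plain Python rather than the generated CUDA, of the 64-bit pointer arithmetic this affects; the element size, names, and addresses are assumptions for illustration only:

ELEM_SIZE = 4  # bytes per float32 element (assumed for this sketch)

def element_address(base_ptr: int, index64: int) -> int:
    # Once the index has been widened to 64 bits (explicitly or implicitly),
    # the address computation is the same; what differs is the value produced
    # by the widening step shown above.
    return base_ptr + index64 * ELEM_SIZE

base = 0x7F0000000000
print(hex(element_address(base, 4294967295)))  # zero-extended index: far past the base
print(hex(element_address(base, -1)))          # sign-extended index: 4 bytes before the base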

I think we can kick this down the road for a bit. It's a good suggestion and would add convenience and the ability to address more than 2 giga-elements, but the workaround of forcing signed 32-bit arithmetic seems okay for practical purposes for now.
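
For now, a minimal sketch of what that workaround looks like in practice when indices arrive as unsigned values (the kernel and launch below are illustrative, not part of the issue):

import numpy as np
import warp as wp

@wp.kernel
def gather(indices: wp.array(dtype=wp.uint32),
           src: wp.array(dtype=wp.float32),
           dst: wp.array(dtype=wp.float32)):
    tid = wp.tid()
    # Workaround: force signed 32-bit arithmetic by casting the unsigned index.
    dst[tid] = src[int(indices[tid])]

indices = wp.array(np.array([2, 0, 1], dtype=np.uint32), dtype=wp.uint32)
src = wp.array(np.array([10.0, 20.0, 30.0], dtype=np.float32), dtype=wp.float32)
dst = wp.zeros(3, dtype=wp.float32)
wp.launch(gather, dim=3, inputs=[indices, src, dst])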