NVIDIA / warp

A Python framework for high-performance GPU simulation and graphics

Home Page: https://nvidia.github.io/warp/


CUDA build failed when using break in while loop

yuhan-zh opened this issue · comments

commented

Hello,
I tried to write a while loop in a kernel function and use break to terminate it, but the CUDA build failed. Am I not allowed to do this, or is it a bug?

Here is a program that reproduces the error, along with the error message:

import warp as wp
import numpy as np
wp.init()
num_points = 1024
@wp.kernel
def length(points: wp.array(dtype=wp.vec3),
           lengths: wp.array(dtype=float)):
    tid = wp.tid()
    length = wp.length(points[tid])
    while True:
        if wp.length(points[tid]) > 1:
            length = -1.0
        break
    lengths[tid] = length


points = wp.array(np.random.rand(num_points, 3), dtype=wp.vec3)
lengths = wp.zeros(num_points, dtype=float)
wp.launch(kernel=length,
          dim=len(points),
          inputs=[points, lengths])

print(lengths)
 Warp 0.9.0 initialized:
   CUDA Toolkit: 12.1, Driver: 12.1
   Devices:
     "cpu"    | Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
     "cuda:0" | NVIDIA GeForce RTX 2080 (sm_75)
   Kernel cache: C:\Users\xxx\AppData\Local\NVIDIA Corporation\warp\Cache\0.9.0
Warp NVRTC compilation error 6: NVRTC_ERROR_COMPILATION (C:\Users\xxx\workspace\warp-gitlab\warp\native\warp.cu:1107)
default_program(63): error: label "for_end_0" was referenced but not defined
                goto for_end_0;
                     ^

default_program(102): warning #177-D: variable "adj_3" was declared but never referenced
      bool adj_3(0);
           ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

default_program(105): warning #550-D: variable "adj_6" was set but never used
      int32 adj_6(0);
            ^

default_program(145): error: label "for_end_0" was referenced but not defined
                goto for_end_0;
                     ^
2 errors detected in the compilation of "default_program".
Module __main__ load on device 'cuda:0' took 195.17 ms
Traceback (most recent call last):
  File "c:\Users\xxx\workspace\warp-gitlab\examples\example_test.py", line 32, in <module>
    wp.launch(kernel=length,
  File "C:\Users\xxx\workspace\warp-gitlab\warp\context.py", line 2904, in launch
    if not module.load(device):
  File "C:\Users\xxx\workspace\warp-gitlab\warp\context.py", line 1398, in load
    raise (e)
  File "C:\Users\xxx\workspace\warp-gitlab\warp\context.py", line 1376, in load
    warp.build.build_cuda(
  File "C:\Users\xxx\workspace\warp-gitlab\warp\build.py", line 144, in build_cuda
    raise Exception("CUDA build failed")
Exception: CUDA build failed

Thank you!

Thanks for reporting this! I'll have to take a closer look at why this ends up failing at the CUDA level, but we don't support (potentially) infinite loops at the moment, only range-based loops.

So I'd recommend replacing the while True: with something like for _ in range(1000):. This assumes 1000 iterations is both sufficiently large for the algorithm in question and sufficiently small to prevent excessive runtimes for outlier scenarios, so adjust it to your needs.
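
For reference, here is a minimal sketch of that suggestion applied to the kernel from the report. The kernel name length_bounded, the local variable result, and the 1000-iteration cap are placeholders for illustration (the original kernel reuses the name length for its local variable); this sketch has not been verified against Warp 0.9.0.

import warp as wp
import numpy as np

wp.init()

num_points = 1024

@wp.kernel
def length_bounded(points: wp.array(dtype=wp.vec3),
                   lengths: wp.array(dtype=float)):
    tid = wp.tid()
    result = wp.length(points[tid])
    # Bounded range loop instead of `while True:`; break still exits early.
    for _ in range(1000):
        if wp.length(points[tid]) > 1.0:
            result = -1.0
        break
    lengths[tid] = result

points = wp.array(np.random.rand(num_points, 3), dtype=wp.vec3)
lengths = wp.zeros(num_points, dtype=float)
wp.launch(kernel=length_bounded,
          dim=len(points),
          inputs=[points, lengths])

print(lengths)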

commented

@c0d1f1ed I used a variable to decide whether to end the loop, which avoids using break, and it works for now. The for-loop suggestion also works.
Thank you for the advice!
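
For completeness, here is a minimal sketch of the flag-based workaround described in the comment above. The kernel name length_flag, the local variable result, and the flag keep_going are placeholders rather than the reporter's actual code, and the sketch assumes a condition-controlled while loop compiles on this Warp version (the reporter states their flag-based variant does). It can be launched exactly like the example in the original report.

import warp as wp

@wp.kernel
def length_flag(points: wp.array(dtype=wp.vec3),
                lengths: wp.array(dtype=float)):
    tid = wp.tid()
    result = wp.length(points[tid])
    keep_going = True
    # The flag ends the loop by falsifying the condition, so no break is needed.
    while keep_going:
        if wp.length(points[tid]) > 1.0:
            result = -1.0
        keep_going = False
    lengths[tid] = result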