NVIDIA / warp

A Python framework for high performance GPU simulation and graphics

Home Page: https://nvidia.github.io/warp/

Variable aliasing stops gradient flow.

xuan-li opened this issue

To reproduce:

import warp as wp

wp.init()

@wp.func
def test(a: wp.float32):
    b = a  # plain alias; this assignment is what triggers the bug
    if b > 0.0:
        b = a * a
    else:
        b = a * a * a
    return b

@wp.kernel
def test_grad(a: wp.array(dtype=wp.float32), b: wp.array(dtype=wp.float32)):
    tid = wp.tid()
    b[tid] = test(a[tid])

a = wp.array([-1., 2., 3.], dtype=wp.float32, requires_grad=True)
b = wp.array([0., 0., 0.], dtype=wp.float32, requires_grad=True)

wp.launch(test_grad, a.shape[0], inputs=[a, b])  # forward pass

b.grad = wp.array([1., 1., 1.], dtype=wp.float32)  # seed the output adjoint
wp.launch(test_grad, a.shape[0], inputs=[a, b], adjoint=True, adj_inputs=[None, None])  # backward (adjoint) pass

print(a.grad.numpy())

The result is

[0. 0. 0.]

Changing b = a to b = a * 1.0 on the first line of test works around the bug, and the result is then correct (the analytic gradient is 3a^2 on the a <= 0 branch and 2a on the a > 0 branch, i.e. [3, 4, 6] for the inputs [-1, 2, 3]):

[3. 4. 6.]
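
For reference, here is the worked-around version of test described above; only the first assignment changes:

@wp.func
def test(a: wp.float32):
    b = a * 1.0  # multiplying by 1.0 avoids the plain alias b = a, so the adjoint propagates
    if b > 0.0:
        b = a * a
    else:
        b = a * a * a
    return b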

Similarly, the following kernel also returns an incorrect gradient:

import warp as wp

wp.init()

@wp.kernel
def test_grad(a: wp.array(dtype=wp.float32), b: wp.array(dtype=wp.float32)):
    tid = wp.tid()
    ai = a[tid]
    bi = ai  # plain alias of ai; this line is what triggers the bug
    b[tid] = ai

a = wp.array([-1., 2., 3.], dtype=wp.float32, requires_grad=True)
b = wp.array([0., 0., 0.], dtype=wp.float32, requires_grad=True)

wp.launch(test_grad, a.shape[0], inputs=[a, b])  # forward pass

b.grad = wp.array([1., 1., 1.], dtype=wp.float32)  # seed the output adjoint
wp.launch(test_grad, a.shape[0], inputs=[a, b], adjoint=True, adj_inputs=[None, None])  # backward (adjoint) pass

print(a.grad.numpy())

Changing bi = ai to bi = ai * 1.0 again resolves the issue.
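
Applied to this second kernel, the workaround looks like this; only the aliasing line changes:

@wp.kernel
def test_grad(a: wp.array(dtype=wp.float32), b: wp.array(dtype=wp.float32)):
    tid = wp.tid()
    ai = a[tid]
    bi = ai * 1.0  # the multiplication replaces the plain alias, so the gradient flows again
    b[tid] = ai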

Thanks @xuan-li, that is unexpected - @c0d1f1ed I believe this used to work correctly, perhaps it is related to recent changes?

The next release of Warp will have a fix for this. Thanks for the repro cases!

This is fixed as part of Warp 1.1.0!
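
For reference, on a version with the fix the same gradient check can also be driven through wp.Tape instead of the manual adjoint launch. A minimal sketch, reusing test_grad, a and b from the first repro and assuming a fresh run so that a.grad starts at zero:

tape = wp.Tape()
with tape:
    wp.launch(test_grad, dim=a.shape[0], inputs=[a, b])  # record the forward pass on the tape
tape.backward(grads={b: wp.array([1., 1., 1.], dtype=wp.float32)})  # seed b's adjoint and run the backward pass
print(a.grad.numpy())  # expected: [3. 4. 6.]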