Variable aliasing stops gradient flow.
xuan-li opened this issue
To reproduce:

```python
import warp as wp

wp.init()

@wp.func
def test(a: wp.float32):
    b = a
    if b > 0.0:
        b = a * a
    else:
        b = a * a * a
    return b

@wp.kernel
def test_grad(a: wp.array(dtype=wp.float32), b: wp.array(dtype=wp.float32)):
    tid = wp.tid()
    b[tid] = test(a[tid])

a = wp.array([-1., 2., 3.], dtype=wp.float32, requires_grad=True)
b = wp.array([0., 0., 0.], dtype=wp.float32, requires_grad=True)

wp.launch(test_grad, 1, inputs=[a, b])
b.grad = wp.array([1., 1., 1.], dtype=wp.float32)
wp.launch(test_grad, a.shape[0], inputs=[a, b], adjoint=True, adj_inputs=[None, None])
print(a.grad.numpy())
```
The result is

```
[0. 0. 0.]
```
Changing `b = a` to `b = a * 1.0` in the first line of `test` works around the bug, and the result is correct:

```
[3. 4. 6.]
```
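As a sanity check on those expected values, the gradients can be reproduced without Warp by finite differences on the same piecewise function (a plain-Python sketch, independent of Warp's autodiff):

```python
def f(a):
    # Mirrors the Warp function `test`: square for positive inputs, cube otherwise.
    return a * a if a > 0.0 else a * a * a

def grad_fd(a, h=1e-5):
    # Central finite-difference estimate of df/da.
    return (f(a + h) - f(a - h)) / (2.0 * h)

print([round(grad_fd(a), 4) for a in (-1.0, 2.0, 3.0)])  # [3.0, 4.0, 6.0]
```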
Similarly, the following kernel also returns an incorrect gradient:
```python
import warp as wp

wp.init()

@wp.kernel
def test_grad(a: wp.array(dtype=wp.float32), b: wp.array(dtype=wp.float32)):
    tid = wp.tid()
    ai = a[tid]
    bi = ai
    b[tid] = bi

a = wp.array([-1., 2., 3.], dtype=wp.float32, requires_grad=True)
b = wp.array([0., 0., 0.], dtype=wp.float32, requires_grad=True)

wp.launch(test_grad, 1, inputs=[a, b])
b.grad = wp.array([1., 1., 1.], dtype=wp.float32)
wp.launch(test_grad, a.shape[0], inputs=[a, b], adjoint=True, adj_inputs=[None, None])
print(a.grad.numpy())
```
Changing `bi = ai` to `bi = ai * 1.0` resolves the issue.
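The `* 1.0` workaround makes sense if the backward pass only propagates adjoints through recorded operations: a bare alias that emits no operation leaves the source variable's adjoint untouched, while multiplying by 1.0 forces an identity-like op into the chain. A toy tape sketch illustrating that mechanism (an analogy only, not Warp's actual codegen):

```python
# Toy reverse-mode tape: entries are (out, inp, d_out/d_inp).
tape, values = [], {"a": 2.0}

def record_mul(out, inp, c):
    """Record out = inp * c on the tape (local derivative is c)."""
    values[out] = values[inp] * c
    tape.append((out, inp, c))

def backward(seed):
    """Reverse sweep: adjoints flow only through recorded ops."""
    adj = {seed: 1.0}
    for out, inp, d in reversed(tape):
        adj[inp] = adj.get(inp, 0.0) + adj.get(out, 0.0) * d
    return adj

# Bare alias: b = a records no op, so the chain back to "a" is broken.
values["b"] = values["a"]
record_mul("y", "b", 3.0)           # y = 3 * b
print(backward("y").get("a", 0.0))  # 0.0 -- gradient lost at the alias

# Workaround: b = a * 1.0 records an identity-like op, restoring the chain.
tape, values = [], {"a": 2.0}
record_mul("b", "a", 1.0)           # b = a * 1.0
record_mul("y", "b", 3.0)
print(backward("y").get("a", 0.0))  # 3.0 -- gradient flows
```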
The next release of Warp will have a fix for this. Thanks for the repro cases!
This is fixed as part of Warp 1.1.0!