Adjoint of Matmul
moradza opened this issue
Alireza Moradzadeh commented
The current matmul and batched_matmul implementations break the gradient flow. As an example, at line 3177 in 79a56a9, one should replace a.ptr with ctypes.c_void_p(adj_a.ptr), and beta should be set to 1. In the current implementation, the old gradient of a is overwritten by the matmul call. The same issue holds for b, c, and d.
Miles Macklin commented
Hi @moradza, thanks for the report - @daedalus5 can you take a look?
Nicolas Capens commented
Fixed by 9ae3877
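For readers following the thread, here is a minimal sketch of the accumulation issue described in the original report. It uses plain NumPy rather than Warp's actual implementation, and the function name matmul_backward and its signature are hypothetical. It assumes a forward pass d = a @ b and writes each adjoint update in GEMM form (out = X @ Y + beta * out), so that beta = 1 adds to an existing gradient while beta = 0 overwrites it.

```python
import numpy as np

def matmul_backward(a, b, adj_d, adj_a, adj_b, beta=1.0):
    """Backward pass for d = a @ b (hypothetical helper, not Warp's API).

    Each update has the GEMM form: out = X @ Y + beta * out.
    beta = 1 accumulates into the existing gradient; beta = 0 overwrites it.
    """
    adj_a[...] = adj_d @ b.T + beta * adj_a
    adj_b[...] = a.T @ adj_d + beta * adj_b

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 3))
b = rng.standard_normal((3, 4))
adj_d = np.ones((2, 4))

# Pretend a and b already received gradient from another operation.
prev_a = np.full_like(a, 0.5)
prev_b = np.full_like(b, 0.25)

# beta = 1: the earlier gradient is preserved and the new term is added.
adj_a, adj_b = prev_a.copy(), prev_b.copy()
matmul_backward(a, b, adj_d, adj_a, adj_b, beta=1.0)
kept = np.allclose(adj_a, prev_a + adj_d @ b.T)

# beta = 0: the earlier gradient is discarded, as described in the report.
adj_a0, adj_b0 = prev_a.copy(), prev_b.copy()
matmul_backward(a, b, adj_d, adj_a0, adj_b0, beta=0.0)
lost = np.allclose(adj_a0, adj_d @ b.T)
```

The sketch covers only a and b. The report states that the same overwrite affected c and d, which would need the same accumulate-rather-than-assign treatment.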