dennybritz / nn-from-scratch

Implementing a Neural Network from Scratch

gradient checks do not match

vuptran opened this issue

I checked the gradients you derived against numerical gradients, and your implementation does not match. The error appears to be in two places:

  1. In calculate_loss, you average the total loss (including the regularization term) over the data batch. The correct implementation should average only the log loss, not the regularization term.

  2. In build_model, the gradients (dW1, dW2, db1, db2) computed during backprop should be averaged over the data batch. Again, the correct implementation should not include the regularization terms in that average.
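The two fixes above can be sketched as follows. This is a minimal sketch, not the repo's actual code: it assumes the two-layer tanh/softmax network from the original post, with parameters `W1, b1, W2, b2` and regularization strength `reg_lambda`. The key points are that `data_loss` is averaged over the batch while `reg_loss` is not, and that `delta3` is divided by the batch size before the regularization gradients are added.

```python
import numpy as np

def calculate_loss(model, X, y, reg_lambda=0.01):
    """Average only the cross-entropy (log loss) over the batch;
    add the L2 regularization term without averaging it."""
    W1, b1, W2, b2 = model['W1'], model['b1'], model['W2'], model['b2']
    num_examples = X.shape[0]
    # Forward pass: tanh hidden layer, softmax output.
    a1 = np.tanh(X.dot(W1) + b1)
    scores = a1.dot(W2) + b2
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)
    # Fix 1: average ONLY the log loss over the batch ...
    data_loss = -np.mean(np.log(probs[range(num_examples), y]))
    # ... then add the regularization term unaveraged.
    reg_loss = reg_lambda / 2 * (np.sum(np.square(W1)) + np.sum(np.square(W2)))
    return data_loss + reg_loss

def gradients(model, X, y, reg_lambda=0.01):
    """Backprop matching calculate_loss: the data gradients are averaged
    over the batch; the regularization gradients are not."""
    W1, b1, W2, b2 = model['W1'], model['b1'], model['W2'], model['b2']
    num_examples = X.shape[0]
    a1 = np.tanh(X.dot(W1) + b1)
    scores = a1.dot(W2) + b2
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)
    delta3 = probs
    delta3[range(num_examples), y] -= 1
    # Fix 2: average over the batch here, BEFORE adding reg gradients.
    delta3 /= num_examples
    dW2 = a1.T.dot(delta3) + reg_lambda * W2   # reg term not averaged
    db2 = delta3.sum(axis=0)
    delta2 = delta3.dot(W2.T) * (1 - a1 ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T.dot(delta2) + reg_lambda * W1
    db1 = delta2.sum(axis=0)
    return {'W1': dW1, 'b1': db1, 'W2': dW2, 'b2': db2}
```

With this pairing, a finite-difference check on `calculate_loss` reproduces the analytic gradients to numerical precision, which is exactly the consistency the gradient check verifies.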

Do you have, or know of, a better implementation?
Can you explain or show me how you checked it?

@uripeled2 I have a method for gradient checking in my implementation here: https://github.com/vuptran/introduction-to-neural-networks
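For reference, the usual way such a check works (a generic sketch, not necessarily the linked implementation) is to perturb each parameter by a small epsilon in both directions, estimate the gradient with a central difference, and compare it to the analytic gradient via a relative error. The function names below (`numerical_gradient`, `relative_error`) are illustrative, not from either repo.

```python
import numpy as np

def numerical_gradient(loss_fn, param, eps=1e-5):
    """Central-difference estimate of d(loss)/d(param), one element at a time.
    loss_fn is a zero-argument callable that reads `param` in place."""
    grad = np.zeros_like(param)
    it = np.nditer(param, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = param[idx]
        param[idx] = old + eps
        loss_plus = loss_fn()
        param[idx] = old - eps
        loss_minus = loss_fn()
        param[idx] = old                      # restore the parameter
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
        it.iternext()
    return grad

def relative_error(analytic, numeric):
    """Max elementwise relative error; values around 1e-7 or smaller
    are generally taken to mean the gradients match."""
    num = np.abs(analytic - numeric)
    den = np.maximum(np.abs(analytic) + np.abs(numeric), 1e-12)
    return np.max(num / den)
```

Running this checker against the derived backprop gradients for every parameter (W1, b1, W2, b2) is what exposes the averaging mismatch described above.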