Resetting the grad of weights and biases is not enough
zurtal opened this issue
zurtal commented
In the video "The spelled-out intro to neural networks and backpropagation: building micrograd" you present the following code:
n = MLP(3, [4, 4, 1])

xs = [
    [2.0, 3.0, -1.0],
    [3.0, -1.0, 0.5],
    [0.5, 1.0, 1.0],
    [1.0, 1.0, -1.0],
]
ys = [1.0, -1.0, -1.0, 1.0]  # desired targets

for k in range(20):

    # forward pass
    ypred = [n(x) for x in xs]
    loss = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))

    # backward pass
    for p in n.parameters():
        p.grad = 0.0
    loss.backward()

    # update
    for p in n.parameters():
        p.data += -0.1 * p.grad

    print(k, loss.data)
However, before calling loss.backward() we should reset the grad of ALL Values in the graph, not just those in n.parameters(), because every call to loss.backward() accumulates (+=) into the grad of every node it reaches.
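A minimal sketch of the accumulation, assuming the Value class from the published micrograd repo (micrograd.engine); the notebook version built in the video behaves the same way:

from micrograd.engine import Value

a = Value(2.0)
b = Value(3.0)
c = a * b

c.backward()
print(a.grad)  # 3.0

c.backward()   # no zeroing in between: leaf grads accumulate via +=
print(a.grad)  # 6.0 -- doubled, not recomputed

a.grad = 0.0   # this is the per-parameter reset the training loop performs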
zurtal commented
The issue is not as described: the forward pass ypred = [n(x) for x in xs] builds a brand-new computation graph every iteration, so the intermediate Values are fresh objects whose grad starts at 0. Only the parameters survive across iterations, and those are exactly the grads the loop already resets.
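A quick way to see this, again assuming the repo's classes (micrograd.nn.MLP and micrograd.engine.Value):

from micrograd.nn import MLP

n = MLP(3, [4, 4, 1])
x = [2.0, 3.0, -1.0]

out1 = n(x)      # first forward pass builds a computation graph
out1.backward()  # writes grads into out1's graph and into n.parameters()

out2 = n(x)                    # second forward pass: a brand-new graph
print(out2 is out1)            # False -- entirely different Value objects
print(out2.grad)               # 0 -- fresh intermediate node, untouched
print(n.parameters()[0].grad)  # generally nonzero -- parameters persist,
                               # so theirs are the only grads to reset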