karpathy / micrograd

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Resetting the grad of weights and biases is not enough

zurtal opened this issue · comments

In the video "The spelled-out intro to neural networks and backpropagation: building micrograd" you present the following code:

n = MLP(3, [4, 4, 1])
xs = [
  [2.0, 3.0, -1.0],
  [3.0, -1.0, 0.5],
  [0.5, 1.0, 1.0],
  [1.0, 1.0, -1.0],
]
ys = [1.0, -1.0, -1.0, 1.0] # desired targets
for k in range(20):
  
  # forward pass
  ypred = [n(x) for x in xs]
  loss = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))
  
  # backward pass
  for p in n.parameters():
    p.grad = 0.0
  loss.backward()
  
  # update
  for p in n.parameters():
    p.data += -0.1 * p.grad
  
  print(k, loss.data) 

However, before calling loss.backward() we should reset the grad of ALL values, not just of n.parameters(), because every call to loss.backward() accumulates (+=) into the grad of every node it touches.
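
For illustration, here is a minimal sketch of the accumulation being described, using the Value class from micrograd.engine (the specific numbers are made up for the example):

from micrograd.engine import Value

# build a tiny graph once and call backward() twice without resetting
a = Value(2.0)
b = Value(-3.0)
loss = a * b

loss.backward()
print(a.grad)    # d(loss)/da = b.data = -3.0

loss.backward()  # no reset in between, so the gradient accumulates
print(a.grad)    # -6.0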

The issue is not as described: the forward pass ypred = [n(x) for x in xs] rebuilds the computation graph on every iteration, so all intermediate Value objects are created fresh with grad = 0. Only the parameters survive between iterations, and those are exactly the ones being reset.
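
A small sketch of that point, again using micrograd's Value (the parameter and input values are made up): the "parameter" w persists across iterations and is zeroed explicitly, while the input and intermediate nodes are new objects on each pass, so their grads start at 0.0 and nothing accumulates across iterations.

from micrograd.engine import Value

w = Value(2.0)                 # persistent parameter, lives across iterations

for step in range(3):
    x = Value(3.0)             # fresh input node every iteration
    loss = (w * x) ** 2        # fresh intermediate nodes every iteration

    w.grad = 0.0               # reset only the persistent parameter
    loss.backward()

    print(step, w.grad)        # 36.0 each time; no accumulation across steps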