karpathy / micrograd

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Issue with zero_grad?

sky87 opened this issue · comments

Hi, unless I'm misunderstanding something, zero_grad in nn.py only zeroes out the gradients on the parameter nodes, but shouldn't it do that for all the nodes in the graph?
Otherwise the inner nodes will keep accumulating gradients.
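For reference, this is roughly the zero_grad being discussed (a paraphrase of Module in nn.py, not the exact source):

```python
# Rough sketch of nn.Module.zero_grad (paraphrased, may not match the
# repo exactly): it only resets the grads of the parameters, i.e. the
# Values returned by parameters(), not every node in the graph.
class Module:

    def zero_grad(self):
        for p in self.parameters():
            p.grad = 0

    def parameters(self):
        return []
```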

My bad, I didn't read the whole file carefully enough :)

@sky87 what did you realize? I have the same question but haven't figured it out.

Judging from #8, this is still a known issue?

@ben-z It's been a few months, but if I remember correctly the inner nodes are new Value instances that get recreated on every forward pass (see, for example, all the products and the sum in Neuron#__call__), so you don't need to zero them out. The parameters are the only things that survive between runs.
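A minimal sketch of that point, paraphrasing Neuron from nn.py (assuming micrograd is installed so Value comes from micrograd.engine; names and details may differ slightly from the repo):

```python
import random
from micrograd.engine import Value

class Neuron:
    def __init__(self, nin):
        # The parameters (weights and bias) are created once and live
        # across training steps, so zero_grad has to reset their .grad.
        self.w = [Value(random.uniform(-1, 1)) for _ in range(nin)]
        self.b = Value(0)

    def __call__(self, x):
        # Every product wi * xi and the running sum build brand-new
        # Value nodes on each forward pass, each starting with grad = 0,
        # so stale gradients never accumulate on these inner nodes.
        act = sum((wi * xi for wi, xi in zip(self.w, x)), self.b)
        return act.relu()
```

Calling the neuron twice builds two separate graphs of intermediate Values; only self.w and self.b (and their .grad) are shared between calls, which is why zero_grad only needs to touch the parameters.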


That makes a lot of sense!! Thanks for the explanation.