tsoding / nn.h

Simple stb-style header-only library for Neural Networks

Spiky Cost Trajectories

SamuelSchlesinger opened this issue · comments

In certain training scenarios, I see extremely spiky cost trajectories during training. I suspect this could be solved (at least partially) by implementing AdaGrad or some other adaptive learning-rate scheme, where the rate is adapted per parameter, or simply adapted over time at all. I've got an executable in my branch that generates the entire Boolean table of a random function on n bits, and this behavior is easy to reproduce with random functions. Here's an example with a 12-bit function:

[Screenshot (2023-05-19): cost trajectory with large spikes]

Actually, I implemented something simpler that seems to help a bit: I dynamically adjust the rate up or down depending on how much the cost is fluctuating. It's very primitive, but it does the trick. In effect it speeds up the rate at first and then slows it down toward the end.
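A minimal sketch of that "bump the rate up or down" idea might look like this; the growth/shrink factors (1.05 and 0.7) are assumptions, not values from the branch:

```c
// If the cost went down since the last step, grow the rate slightly;
// if it went up (a spike), back off more aggressively.
float adapt_rate(float rate, float prev_cost, float cost)
{
    if (cost < prev_cost) return rate * 1.05f; // still improving: speed up
    return rate * 0.7f;                        // cost spiked: slow down
}
```

Because growth is gentle and shrinkage is aggressive, the rate climbs early when the cost drops steadily, then collapses quickly once training starts oscillating.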

For the best results, I ended up implementing a running exponential smoothing of the cost change and using that as a regulatory factor for the rate; otherwise, the spikes can cause a significant setback in training. Another good idea is to store the best version of the network found so far and provide the ability to revert to it.
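The two ideas could be sketched as follows, under assumed names and layout (a flattened parameter array, a smoothing factor `alpha`); this is an illustration of the approach, not the actual code from the branch:

```c
#include <math.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    float ema_delta; // exponentially smoothed |cost change|
    float best_cost; // lowest cost observed so far
} Regulator;

// Update the smoothed cost delta and derive a damped rate from it:
// the spikier the recent cost history, the smaller the step.
float regulated_rate(Regulator *r, float base_rate,
                     float prev_cost, float cost, float alpha)
{
    float delta = fabsf(cost - prev_cost);
    r->ema_delta = (1.0f - alpha)*r->ema_delta + alpha*delta;
    return base_rate / (1.0f + r->ema_delta);
}

// Snapshot the parameters whenever a new best cost appears, so
// training can revert to the best network found so far.
void maybe_snapshot(Regulator *r, float cost,
                    float *best_params, const float *params, size_t n)
{
    if (cost < r->best_cost) {
        r->best_cost = cost;
        memcpy(best_params, params, n * sizeof(*params));
    }
}
```

Reverting is then just copying `best_params` back over the live parameters when the smoothed delta signals a prolonged spike.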