Get GPT loss to decrease to 0 for single batch
bclarkson-code opened this issue
bclarkson-code commented
To make sure that everything is working, we should be able to drive the model's training loss to (approximately) 0 by training repeatedly on a single batch. If the loss doesn't get there, then there are bugs that need fixing.
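As a rough illustration of the check, here is a minimal sketch in plain Python. It does not use this repo's model or API: the bigram logit table, the `overfit_single_batch` helper, the toy batch, and the `steps`/`lr` values are all hypothetical stand-ins for the GPT and its training loop. The idea is the same: train on one fixed batch over and over and confirm that the mean cross-entropy loss falls towards 0. With a softmax output the loss never reaches exactly 0, so the check should use a small threshold.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def overfit_single_batch(inputs, targets, vocab_size, steps=500, lr=1.0):
    """Train a toy bigram model on one fixed batch.

    Returns the mean cross-entropy loss recorded at every step.
    """
    # weights[i][j] is the logit for next token j given current token i.
    weights = [[0.0] * vocab_size for _ in range(vocab_size)]
    n = len(inputs)
    losses = []
    for _ in range(steps):
        grads = [[0.0] * vocab_size for _ in range(vocab_size)]
        loss = 0.0
        for x, y in zip(inputs, targets):
            probs = softmax(weights[x])
            loss -= math.log(probs[y])
            # Gradient of cross-entropy w.r.t. the logits: probs - one_hot(y).
            for j in range(vocab_size):
                grads[x][j] += probs[j] - (1.0 if j == y else 0.0)
        losses.append(loss / n)
        # Plain gradient descent on the mean loss over the batch.
        for i in range(vocab_size):
            for j in range(vocab_size):
                weights[i][j] -= lr * grads[i][j] / n
    return losses


# Hypothetical batch: every input token has exactly one target,
# so the batch can be memorised perfectly.
inputs = [0, 1, 2, 3]
targets = [1, 2, 3, 0]
losses = overfit_single_batch(inputs, targets, vocab_size=4)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the equivalent loop on the real model plateaus well above 0, the usual places to look are the loss, the backward pass, and the optimiser update. Note that the toy batch maps each input to a single target; a batch containing the same context with two different targets cannot be fit to 0 loss by any model.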