mojivalipour / symbolicgpt

Symbolic regression is the task of identifying a mathematical expression that best fits a provided dataset of input and output values. In this work, we present SymbolicGPT, a novel transformer-based language model for symbolic regression.

Home Page: https://git.uwaterloo.ca/data-analytics-lab/symbolicgpt2

bug in cosine learning rate decay?

t-taniai opened this issue · comments

I recalled another issue; it may or may not be a bug.

TrainerConfig has a parameter final_tokens for cosine lr decay, which is set to final_tokens=2*len(train_dataset)*blockSize. To my understanding, this draws an lr curve with many repeated cosine cycles over the course of training (i.e., one epoch to decay from 1 to 0, then another epoch to increase from 0 back to 1).
I'm not familiar with cosine lr schedules, but the proper setting is probably final_tokens=numEpochs*len(train_dataset)*blockSize (i.e., a single decay from 1 to 0, a half cycle of cos(t)+1, over the whole training run)?
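To illustrate the behavior being described, here is a minimal sketch of a minGPT-style cosine decay multiplier (the function name and signature are hypothetical, not copied from this repository). With final_tokens set to only 2 epochs' worth of tokens, the progress ratio passes 1.0 partway through training and the cosine swings back up, producing repeated cycles instead of a single decay:

```python
import math

def lr_multiplier(tokens_seen, warmup_tokens, final_tokens):
    """Hypothetical sketch of a minGPT-style cosine LR multiplier.

    tokens_seen: total tokens processed so far
    warmup_tokens: linear warmup period (in tokens)
    final_tokens: point at which the cosine is meant to reach its minimum
    """
    if tokens_seen < warmup_tokens:
        # linear warmup from 0 to 1
        return tokens_seen / max(1, warmup_tokens)
    # fraction of the decay period completed; exceeds 1.0 if
    # training runs past final_tokens, so the cosine cycles again
    progress = (tokens_seen - warmup_tokens) / max(1, final_tokens - warmup_tokens)
    return max(0.1, 0.5 * (1.0 + math.cos(math.pi * progress)))

# Suppose 100 tokens per epoch and final_tokens = 2 epochs = 200 tokens:
print(lr_multiplier(0,   0, 200))  # start of epoch 1: multiplier 1.0
print(lr_multiplier(200, 0, 200))  # end of epoch 2: multiplier at floor 0.1
print(lr_multiplier(400, 0, 200))  # end of epoch 4: back up to 1.0 (cycle repeats)
```

Setting final_tokens to numEpochs*len(train_dataset)*blockSize keeps progress ≤ 1.0 for the whole run, so the multiplier traces a single half-cosine from 1 down to the floor.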

Thank you for bringing this up. The hyperparameters in this repository are certainly not optimal. The best way to address this would be to run a hyperparameter tuner and select the values that work best. However, I agree that final_tokens=numEpochs*len(train_dataset)*blockSize is most probably the better choice.