OverLordGoldDragon / keras-adamw

Keras/TF implementation of AdamW, SGDW, NadamW, Warm Restarts, and Learning Rate multipliers

AdaBelief

KochiseBennett opened this issue

Thank you very much for your work on this project! It really is an excellent contribution to provide an up-to-date AdamW implementation that allows layer-dependent learning rates. I'm wondering what your thoughts are on AdaBelief, and whether you'd want to add it as an option to this package.
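(For reference, the layer-dependent learning rates I mean are the package's `lr_multipliers` argument. A minimal sketch in the spirit of the README; exact keyword arguments may differ between versions:)

```python
from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model
from keras_adamw import AdamW

ipt = Input(shape=(120, 4))
x = LSTM(60, activation='relu', name='lstm_1')(ipt)
out = Dense(1, activation='sigmoid')(x)
model = Model(ipt, out)

# Train the 'lstm_1' weights at half the base learning rate
lr_multipliers = {'lstm_1': 0.5}
optimizer = AdamW(lr=1e-4, model=model, lr_multipliers=lr_multipliers,
                  use_cosine_annealing=True, total_iterations=24)
model.compile(optimizer, loss='binary_crossentropy')
```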

Glad you found it useful.

No plans for any new optimizers, I'm afraid, but the layerwise LRs should be easily transferable to others. Further, I've moved to PyTorch and won't be developing any more TensorFlow packages (though I may still fix compatibility bugs for later TF versions).
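For anyone wanting to attempt the transfer, here's a rough sketch of grafting layerwise multipliers onto an AdaBelief-style update in plain TF2; the multiplier dict and helper below are illustrative, not part of this package:

```python
import tensorflow as tf

beta1, beta2, eps, base_lr = 0.9, 0.999, 1e-8, 1e-3
LR_MULTIPLIERS = {'dense_1': 0.25}  # hypothetical per-layer multipliers

def multiplier_for(name, default=1.0):
    """Return the multiplier whose key appears in the variable's name."""
    for key, mult in LR_MULTIPLIERS.items():
        if key in name:
            return mult
    return default

def adabelief_step(variables, grads, m, s, t):
    """One AdaBelief-style update with per-variable LR multipliers.

    m, s: lists of tf.Variable slots matching `variables`. AdaBelief
    tracks s_t = beta2*s + (1 - beta2)*(g - m)^2 -- the gradient's
    deviation from its running mean -- in place of Adam's g^2 term.
    """
    for var, g, m_i, s_i in zip(variables, grads, m, s):
        m_i.assign(beta1 * m_i + (1 - beta1) * g)
        s_i.assign(beta2 * s_i + (1 - beta2) * tf.square(g - m_i))
        m_hat = m_i / (1 - beta1 ** t)  # bias correction
        s_hat = s_i / (1 - beta2 ** t)
        lr = base_lr * multiplier_for(var.name)  # layerwise LR applied here
        var.assign_sub(lr * m_hat / (tf.sqrt(s_hat) + eps))

# Usage sketch:
#   variables = model.trainable_variables
#   m = [tf.Variable(tf.zeros_like(v)) for v in variables]
#   s = [tf.Variable(tf.zeros_like(v)) for v in variables]
#   adabelief_step(variables, grads, m, s, t=step)  # t starts at 1
```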