IMPORTANT: upgrade to 1.23
OverLordGoldDragon opened this issue · comments
Versions 1.2 and 1.21 use an erroneous decay formula, decaying l1 as l2 and vice versa; this is fixed in 1.23. Pardon the mishap.
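To make the impact of the swap concrete, here is a minimal sketch of how the two penalties differ as weight-decay steps. It is not this library's code: the function names (`l1_decay`, `l2_decay`), the `rate` parameter, and the plain-list weights are assumptions for illustration only, and the actual formula in the package may scale these terms differently.

```python
def sign(x):
    # Returns 1 for positive, -1 for negative, 0 for zero.
    return (x > 0) - (x < 0)

def l1_decay(weights, rate):
    # Hypothetical l1 step: every nonzero weight moves toward zero
    # by the same fixed amount, regardless of its magnitude.
    return [w - rate * sign(w) for w in weights]

def l2_decay(weights, rate):
    # Hypothetical l2 step: every weight shrinks in proportion
    # to its own magnitude.
    return [w - rate * w for w in weights]

weights = [2.0, -0.5, 0.0]
print("l1:", l1_decay(weights, 0.1))
print("l2:", l2_decay(weights, 0.1))
```

If the two are swapped, a model configured for l1 gets proportional shrinkage instead of a constant pull toward zero (and the reverse for l2), so models trained on the affected versions may have been regularized differently than intended.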
Keras/TF implementation of AdamW, SGDW, NadamW, Warm Restarts, and Learning Rate multipliers