titu1994 / keras-LAMB-Optimizer

Implementation of the LAMB optimizer for Keras from the paper "Reducing BERT Pre-Training Time from 3 Days to 76 Minutes"

titu1994/keras-LAMB-Optimizer Stargazers