training-BERT-from-scratch
RoBERTa (A Robustly Optimized BERT Pretraining Approach) builds on BERT and modifies key hyperparameters: it removes the next-sentence prediction pretraining objective and trains with much larger mini-batches and learning rates.
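Below is a minimal sketch of that recipe using the Hugging Face `transformers` and `datasets` libraries, an assumption on my part; this repo's actual training script may differ. The tokenizer checkpoint, toy corpus, model sizes, and hyperparameters are illustrative only.

```python
# A minimal RoBERTa-style pretraining sketch: masked LM only,
# no next-sentence prediction head or labels.
from datasets import Dataset
from transformers import (
    DataCollatorForLanguageModeling,
    RobertaConfig,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

# Illustrative: a standard pretrained tokenizer stands in for one
# trained on your own corpus.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

# A tiny toy corpus; replace with your real pretraining data.
texts = [
    "BERT is pretrained on large text corpora.",
    "RoBERTa drops next-sentence prediction and trains longer.",
]
train_dataset = Dataset.from_dict(
    tokenizer(texts, truncation=True, max_length=128)
)

# A small, randomly initialised RoBERTa model (layer/head counts are
# illustrative, not the paper's configuration).
config = RobertaConfig(
    vocab_size=tokenizer.vocab_size,
    max_position_embeddings=514,
    num_hidden_layers=6,
    num_attention_heads=12,
    hidden_size=768,
    type_vocab_size=1,  # no NSP, so no segment embeddings needed
)
model = RobertaForMaskedLM(config=config)

# Masked-language-modelling objective only; masks are sampled anew
# every time a batch is built.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

# Large effective batch via gradient accumulation, with a higher peak
# learning rate, in the spirit of the RoBERTa recipe.
training_args = TrainingArguments(
    output_dir="./roberta-from-scratch",
    per_device_train_batch_size=32,
    gradient_accumulation_steps=8,
    learning_rate=6e-4,
    warmup_steps=10,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()
```

Note that because the collator re-samples masks on every batch, this also gives RoBERTa's dynamic masking for free, in contrast to BERT's static masking applied once during preprocessing.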
Credits
Maintained by
Kuldeep Singh Sidhu
GitHub: https://github.com/singhsidhukuldeep
Website: http://kuldeepsinghsidhu.com
LinkedIn: https://www.linkedin.com/in/singhsidhukuldeep/
Contributors
The full list of contributors is available on the repository's contributors page.
Say Thanks
If this helped you in any way, it would be great if you could share it with others.