multilingual transformer roberta bert pytorch xla

multilingual-clf

Data

The data comes from the Kaggle competition Jigsaw Multilingual Toxic Comment Classification.

Workings

Refer to my Kaggle Notebook to see how everything works.

- Use the PyTorch nightly build. PyTorch and torch_xla often seem to be unstable.
- bert-multilingual-uncased models work very easily, with no SIGKILL or other memory issues.
- The xlm-roberta-base model works too with batch_size=8.
- xlm-roberta-large is a lot trickier. It requires garbage collection and loading the dataloader only once.
- The model needs to be instantiated only once and wrapped with the wrapper provided in the torch_xla library.
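
The memory-related points above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the wrapper in question is `xmp.MpModelWrapper` from `torch_xla.distributed.xla_multiprocessing`, and the helper names `build_once`, `wrap_for_xla` and `run_epochs` are hypothetical. The torch_xla import is guarded so the sketch also runs on a machine without a TPU.

```python
import gc


def build_once(factory):
    """Return a getter that calls `factory` on first use and caches the result.

    Hypothetical helper: keeps an expensive object (model or dataloader)
    from being rebuilt on every epoch.
    """
    cache = []

    def get():
        if not cache:
            cache.append(factory())
        return cache[0]

    return get


def wrap_for_xla(model):
    """Wrap the model for multi-core TPU training when torch_xla is present.

    Assumption: the wrapper meant in the list above is xmp.MpModelWrapper.
    Without torch_xla the model is returned unchanged.
    """
    try:
        import torch_xla.distributed.xla_multiprocessing as xmp
    except ImportError:
        return model
    return xmp.MpModelWrapper(model)


def run_epochs(get_loader, train_step, num_epochs):
    """Reuse the same loader every epoch and collect garbage in between."""
    for _ in range(num_epochs):
        for batch in get_loader():
            train_step(batch)
        gc.collect()  # free unreachable host-side objects before the next epoch
```

In a real run, the wrapped model would typically be created once at module level, before the TPU processes are spawned, and each process would then move it to its own device with `.to(device)`.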

Todo

Add Multiple Sample Dropout
Mixed precision training

About

Classification of a multilingual dataset using pre-trained models fine-tuned only on English training data. The model is trained on TPUs using PyTorch and the torch_xla library.
