Small RoBERTa model for Vietnamese, with Hugging Face training scripts.

- Architecture: 6 layers, 12 attention heads, hidden size 384; pretrained with whole-word masking.
- Input is pre-tokenized with underthesea's `word_tokenize` (sketched below).
- Pretraining data: the Vietnews training set and the first 3 GB of the Vietnamese OSCAR corpus.
- The pretrained model is now available on Hugging Face (a loading example follows below).
- Fine-tuned for the abstractive summarization task on the Vietnews dataset (see `abstractive-summarization`; a seq2seq sketch follows below).
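
Since the corpus is word-tokenized with underthesea before training, the pre-tokenization step might look like the following minimal sketch (the example sentence is illustrative):

```python
# Word-level pre-tokenization with underthesea, as done before pretraining.
# pip install underthesea
from underthesea import word_tokenize

text = "Chàng trai 9X Quảng Trị khởi nghiệp từ nấm sò"

# format="text" joins the syllables of each multi-syllable word with
# underscores, yielding whitespace-separated words for the tokenizer.
print(word_tokenize(text, format="text"))
# Chàng_trai 9X Quảng_Trị khởi_nghiệp từ nấm_sò
```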
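
The pretrained checkpoint can be pulled from the Hugging Face Hub with the standard `transformers` auto classes. The model ID below is a hypothetical placeholder, since the README does not spell out the published ID; substitute the actual one from the Hub:

```python
# Loading the pretrained masked-LM checkpoint from the Hugging Face Hub.
from transformers import AutoModelForMaskedLM, AutoTokenizer
from underthesea import word_tokenize

model_id = "your-username/vietnamese-roberta-small"  # hypothetical placeholder ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Word-tokenize the input with underthesea first, matching the
# pre-tokenization used during pretraining.
inputs = tokenizer(
    word_tokenize("Hà Nội là thủ đô của Việt Nam", format="text"),
    return_tensors="pt",
)
outputs = model(**inputs)
```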
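
For the abstractive summarization fine-tune, the details live in `abstractive-summarization`. One common way to reuse an encoder-only RoBERTa checkpoint for sequence-to-sequence summarization is `transformers`' `EncoderDecoderModel`, which warm-starts both encoder and decoder from the same checkpoint; the following is a sketch of that general technique, not necessarily the exact recipe used in this repo:

```python
# Warm-starting a seq2seq summarizer from the pretrained RoBERTa checkpoint.
from transformers import EncoderDecoderModel, AutoTokenizer

model_id = "your-username/vietnamese-roberta-small"  # hypothetical placeholder ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = EncoderDecoderModel.from_encoder_decoder_pretrained(model_id, model_id)

# Generation settings required before seq2seq training/decoding.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.pad_token_id = tokenizer.pad_token_id
```

From here, training on Vietnews article/abstract pairs follows the usual `Seq2SeqTrainer` pattern, with inputs pre-tokenized by underthesea as above.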