Question on Pretrained Models
atharvjairath opened this issue · comments
Hi, I wanted to ask about this line in the paper:
Their GEC sequence-tagging model, called GECToR, is an encoder made up of a pre-trained BERT-like Transformer
What is that pre-trained model trained on, and where do you get it from?
I am confused about whether you use a pre-trained BERT model from transformers, or whether you trained BERT from scratch, then trained it on the synthetic PIE data, and call that the pre-trained model.
In this particular sentence, we simply meant BERT pre-training.
We use a pre-trained BERT, then train it on Synthetic PIE, and then fine-tune it on human-annotated datasets.
Have you taken that pre-trained BERT and the other models from HuggingFace and then used them in AllenNLP to train them further?
Exactly
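For anyone landing here later, this is roughly what that setup looks like in code. This is only a minimal sketch, not the repository's actual training code: it assumes a recent AllenNLP version and uses `bert-base-cased` as a placeholder model name, so the real GECToR configs may differ.

```python
# Sketch: load a HuggingFace pre-trained BERT and wrap it as an AllenNLP encoder.
# The checkpoint weights come from HuggingFace; they are then trained further,
# first on synthetic (PIE) data and then fine-tuned on human-annotated GEC data.
from allennlp.data.tokenizers import PretrainedTransformerTokenizer
from allennlp.data.token_indexers import PretrainedTransformerIndexer
from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

model_name = "bert-base-cased"  # assumed placeholder; any BERT-like HF model works

# Tokenizer and indexer that match the pre-trained checkpoint's vocabulary.
tokenizer = PretrainedTransformerTokenizer(model_name)
indexer = PretrainedTransformerIndexer(model_name)

# The encoder used inside the tagging model; its weights are initialized from
# the HuggingFace pre-trained checkpoint rather than trained from scratch.
embedder = PretrainedTransformerEmbedder(model_name)
```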