Question on Pretrained Models
atharvjairath opened this issue · comments
Hi, I wanted to ask about this line in the paper:
Their GEC sequence-tagging model, called GECToR, is an encoder made up of a pre-trained BERT-like Transformer
What is that pre-trained model trained on, and where do you get it from?
I am confused about whether you use a pre-trained BERT model from transformers, or whether you trained BERT from scratch, then trained it on the synthetic PIE data, and call that the pre-trained model.
In this particular sentence, we simply meant BERT pre-training.
We use a pre-trained BERT, then train it on Synthetic PIE, and then fine-tune it on human-annotated datasets.
Have you taken that pre-trained BERT and the other models from HuggingFace and then used them in AllenNLP to train them further?
Exactly
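For anyone landing here later, this is roughly what that setup looks like in code. This is only a minimal sketch, not the repository's actual training code: it assumes a recent AllenNLP version and uses `bert-base-cased` as a placeholder model name, so the real GECToR configs may differ.

```python
# Sketch: load a HuggingFace pre-trained BERT and wrap it as an AllenNLP encoder.
# The checkpoint weights come from HuggingFace; they are then trained further,
# first on synthetic (PIE) data and then fine-tuned on human-annotated GEC data.
from allennlp.data.tokenizers import PretrainedTransformerTokenizer
from allennlp.data.token_indexers import PretrainedTransformerIndexer
from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

model_name = "bert-base-cased"  # assumed placeholder; any BERT-like HF model works

# Tokenizer and indexer that match the pre-trained checkpoint's vocabulary.
tokenizer = PretrainedTransformerTokenizer(model_name)
indexer = PretrainedTransformerIndexer(model_name)

# The encoder used inside the tagging model; its weights are initialized from
# the HuggingFace pre-trained checkpoint rather than trained from scratch.
embedder = PretrainedTransformerEmbedder(model_name)
```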