grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)


Not getting any corrections on the custom dataset.

alan-ai-learner opened this issue · comments

I am getting this problem when I try the BERT pretrained model on a custom dataset. It is similar to issue #131.
Any help would be great.
I tried this sentence:

I walk to the store and I bought milk.

Thanks

Could you specify any information on how you ran the script?
Why do you think that the model must propose edits for this particular sentence?

  1. I followed the steps given in issue #36:
cd gector
# create conda env
conda create -n gector python=3.7
conda activate gector
pip install torch===1.3.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
# get model
wget https://grammarly-nlp-data-public.s3.amazonaws.com/gector/bert_0_gector.th
# get eval file and inflate
wget https://www.cl.cam.ac.uk/research/nl/bea2019st/data/wi+locness_v2.1.bea19.tar.gz
tar -xzvf wi+locness_v2.1.bea19.tar.gz
# run inference
python predict.py --model_path ./bert_0_gector.th --vocab_path ./data/output_vocabulary/ --input_file wi+locness/test/ABCN.test.bea19.orig  --output_file foo --transformer_model bert --special_tokens_fix 0

It ran successfully on the "ABCN.test.bea19.orig" file, and I got 2300 corrections out of 4700 sentences; they are fine. After that I made a new test.orig file containing two incorrect sentences and ran the script on it. The code ran successfully, but although both sentences were grammatically incorrect, the output was identical to the sentences in test.orig.

  2. The two sentences I passed were grammatically incorrect (I checked them with the free Grammarly grammar checker), so I was hoping that the model would correct them.
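
To confirm which sentences the model actually changed, the input file and the output file can be compared line by line. The following is a minimal sketch, not part of the GECToR repository: the function name `changed_sentences` is hypothetical, and it assumes the output file has one sentence per line in the same order as the input. Whitespace is normalized before comparing so that spacing differences alone are not counted as corrections.

```python
from pathlib import Path
from typing import List, Tuple


def _normalize(line: str) -> str:
    """Collapse runs of whitespace so spacing alone never counts as an edit."""
    return " ".join(line.split())


def changed_sentences(source_path: str, output_path: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, source, output) for every line that differs.

    Hypothetical helper: assumes both files are line-aligned,
    one sentence per line.
    """
    source = Path(source_path).read_text(encoding="utf-8").splitlines()
    output = Path(output_path).read_text(encoding="utf-8").splitlines()
    if len(source) != len(output):
        raise ValueError(
            f"line count mismatch: {len(source)} source vs {len(output)} output"
        )
    changed = []
    for number, (src, out) in enumerate(zip(source, output), start=1):
        src_norm, out_norm = _normalize(src), _normalize(out)
        if src_norm != out_norm:
            changed.append((number, src_norm, out_norm))
    return changed


# Example usage (file names taken from the commands in this thread):
# for number, src, out in changed_sentences("test.orig", "foo"):
#     print(f"{number}: {src!r} -> {out!r}")
```

An empty result means the model proposed no edits for any sentence in the file, which matches the behaviour described above.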

The script parameters look good to me.
Unfortunately, considering the nature of deep learning models, we cannot guarantee that it will correct all the errors.
In Grammarly, there are many different models, and this particular model isn't among them. So you cannot expect that the GECToR model will fix errors in the same way as Grammarly does.

Thanks for your reply. Can you suggest a few approaches that could improve the performance of the model?

I believe a good way to improve model performance would be fine-tuning it on in-domain datasets.
GECToR was trained on rather academic data where essays of English learners were corrected by tutors.
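
Fine-tuning on in-domain data needs pairs of incorrect and corrected sentences. One common way to store them is as two line-aligned text files, one with the original sentences and one with the corrections. The sketch below only prepares such files; the function name `write_parallel_files` and the file names are hypothetical, the corrected sentence in the example is illustrative, and the repository's README should be checked for the exact preprocessing and training steps that come next.

```python
from typing import Iterable, Tuple


def write_parallel_files(
    pairs: Iterable[Tuple[str, str]],
    source_path: str,
    target_path: str,
) -> int:
    """Write (incorrect, corrected) pairs to two line-aligned files.

    Hypothetical helper, not part of GECToR. Returns the number of
    pairs written; pairs with an empty side are skipped so the two
    files stay aligned.
    """
    written = 0
    with open(source_path, "w", encoding="utf-8") as src, \
            open(target_path, "w", encoding="utf-8") as tgt:
        for incorrect, corrected in pairs:
            # Collapse whitespace so each sentence occupies exactly one line.
            incorrect = " ".join(incorrect.split())
            corrected = " ".join(corrected.split())
            if not incorrect or not corrected:
                continue
            src.write(incorrect + "\n")
            tgt.write(corrected + "\n")
            written += 1
    return written


# Example usage with one illustrative in-domain pair:
# write_parallel_files(
#     [("I walk to the store and I bought milk.",
#       "I walked to the store and I bought milk.")],
#     "in_domain.src.txt",
#     "in_domain.tgt.txt",
# )
```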

Thanks, will try that!