grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)

Why do we need to train for Stage II and Stage III? Why not just train for one stage on the annotated dataset?

chengyang00 opened this issue

I would like to know why training in stages improves performance. Thanks!

My understanding: Stage 1 uses synthetic data, which is huge, so the bulk of training is done on it. Stages 2 and 3 use manually annotated, accurate data containing the kinds of errors humans actually make. That data is tiny compared to the synthetic data, which is why they call it fine-tuning rather than training.

Thanks for answering this. You're right – different stages have data of different quality.
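
For readers who want to see the mechanics of the staged schedule discussed above, here is a minimal toy sketch. It is not GECToR's training code: it fits a one-parameter model y = w * x with plain SGD, and every name in it (`Stage`, `run_stage`, `train_in_stages`, `make_data`) is hypothetical. It assumes, purely for illustration, that the large synthetic set is noisy and slightly biased (slope 1.7 instead of the true 2.0), while the small annotated set is clean. The learning rates and epoch counts are arbitrary choices for the toy, not values from the repo.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Example = Tuple[float, float]  # (input x, target y)


@dataclass
class Stage:
    """Hypothetical description of one training stage."""
    name: str
    data: List[Example]
    lr: float
    epochs: int


def run_stage(w: float, stage: Stage) -> float:
    """Plain SGD on squared error for the one-parameter model y = w * x."""
    for _ in range(stage.epochs):
        for x, y in stage.data:
            grad = 2.0 * (w * x - y) * x
            w -= stage.lr * grad
    return w


def train_in_stages(stages: List[Stage], w: float = 0.0):
    """Run the stages in order; each stage starts from the previous weights."""
    history = []
    for stage in stages:
        w = run_stage(w, stage)
        history.append((stage.name, w))
    return w, history


rng = random.Random(0)  # fixed seed so the run is reproducible
TRUE_W = 2.0


def make_data(n: int, slope: float, noise: float) -> List[Example]:
    """Draw n points from y = slope * x plus Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, slope * x + rng.gauss(0.0, noise)))
    return data


# Assumption for the toy: synthetic data is large, noisy, and slightly biased;
# annotated data is small and clean.
synthetic = make_data(5000, slope=1.7, noise=0.5)
annotated = make_data(50, slope=TRUE_W, noise=0.0)

stages = [
    Stage("stage 1 (synthetic)", synthetic, lr=0.01, epochs=1),
    Stage("stage 2 (annotated)", annotated, lr=0.05, epochs=5),
    Stage("stage 3 (annotated)", annotated, lr=0.01, epochs=5),
]

final_w, history = train_in_stages(stages)
for name, w in history:
    print(f"{name}: w = {w:.3f}")
```

In this toy, stage 1 moves the weight most of the way using the large set, and the later stages correct it on the small, cleaner set. It only illustrates how weights are carried from one stage to the next; it does not by itself show how much staging helps on real GEC data.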