grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)

APPEND won't happen in some cases

linhkid opened this issue · comments

Let's say, for example, my corrupted sentence is

"A B C"

I would like to replace "A" with "D E"

But the result is "D B C"; it should be "D E B C".

I checked my .m2 training data and both tokens are there, but when I run prediction, the second token "E" is always missing.

I also checked during inference (the variable "sugg_token"), and there are no tokens or actions for the word "E".

What could be the reason? I can fix it myself, but it might take a long time. I'd appreciate any help!

Yes, that's true. This is a limitation of our architecture: during one iteration, we can predict only one action per token. That's why we remove the other tags during preprocessing. I would suggest splitting this example into two.
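To make the limitation concrete, here is a minimal toy sketch in Python. It is not the repository's code: the helper `apply_tags` is hypothetical, the rule "keep the first action per token" is an assumption about which tag survives preprocessing, and reading "splitting into two" as two consecutive single-action steps is one possible interpretation. Only the tag names ($KEEP, $DELETE, $REPLACE_x, $APPEND_x) follow the GECToR paper.

```python
def apply_tags(tokens, tags):
    """Apply exactly one GECToR-style tag per token (hypothetical helper)."""
    out = []
    for token, tag in zip(tokens, tags):
        if tag == "$KEEP":
            out.append(token)
        elif tag == "$DELETE":
            continue
        elif tag.startswith("$REPLACE_"):
            out.append(tag[len("$REPLACE_"):])
        elif tag.startswith("$APPEND_"):
            # Keep the token and add the new word right after it.
            out.extend([token, tag[len("$APPEND_"):]])
        else:
            raise ValueError(f"unknown tag: {tag}")
    return out


source = ["A", "B", "C"]

# Turning "A" into "D E" needs two actions on the same source token.
wanted = [["$REPLACE_D", "$APPEND_E"], ["$KEEP"], ["$KEEP"]]

# Assumption: only the first action per token is kept, so $APPEND_E is lost.
one_per_token = [actions[0] for actions in wanted]
single_pass = apply_tags(source, one_per_token)

# Split into two steps: the second step appends "E" after the new "D".
second_pass = apply_tags(single_pass, ["$APPEND_E", "$KEEP", "$KEEP"])

print(single_pass)
print(second_pass)
```

Under these assumptions, the first step reproduces the "D B C" output from the report, and the second step yields "D E B C".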

OK, thanks. Or maybe I can just add an underscore between them, then remove it in postprocessing.

I'm not sure if this is a good solution. Such a tag will be very rare and won't have enough examples (if I understand the nature of your task).
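For reference, the joined-token workaround discussed above could look roughly like the sketch below. Both function names and the `joiner` parameter are hypothetical, and nothing here touches the GECToR codebase; it only shows the string handling before training and after prediction. The concern about rare tags still applies, since every distinct joined pair becomes its own $REPLACE tag in the vocabulary.

```python
def merge_for_training(replacement_tokens, joiner="_"):
    """Join a multi-word replacement into one pseudo-token (hypothetical)."""
    return joiner.join(replacement_tokens)


def split_after_prediction(tokens, joiner="_"):
    """Undo the merge on the model output (hypothetical)."""
    out = []
    for token in tokens:
        out.extend(token.split(joiner))
    return out


merged = merge_for_training(["D", "E"])
tag = "$REPLACE_" + merged

# Assume the model predicted the merged token for the first position.
predicted = [merged, "B", "C"]
restored = split_after_prediction(predicted)

print(tag)
print(restored)
```

Note that this also splits any token that legitimately contains the joiner, so a character sequence that never occurs in the data would be a safer choice than a plain underscore.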