Predict @@UNKNOWN@@ during prediction
tuzeao opened this issue · comments
Hi.
Recently I have been working with this fantastic model. I generated my data, trained the model, and made predictions. Everything seemed to work great.
Finally, when I tried to check the output, something strange happened: in my predictions, many character-level edit operations are predicted as @@UNKNOWN@@, like this:
I don't think anything is wrong with my training process. I generate source and target sentences, split them into two files, tokenize them with the BERT tokenizer, then run the preprocessing step to convert them into the format that train.py expects.
I have only 4 types of edit operations because I am applying this model to Chinese, but that's OK for my use case.
Any ideas on why this would happen? I have checked all the issues, and it seems no one has run into the same situation.
I have been stuck here for about two days, so I would be very thankful if someone could give me some advice.
OK, I figured it out: there was a gap between the labels in my training data and labels.txt.
If you add your own personalized transforms but forget to add them to labels.txt, this is what happens.
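For anyone hitting the same thing, here is a rough sketch of how the mismatch could be checked before training. It is not part of the repo. It assumes labels.txt has one label per line, and that each preprocessed token looks like `tokenSEPL|||SEPRlabel1SEPL__SEPRlabel2`; both delimiters are my assumption about the preprocessed format, so adjust them if your files differ. The function names `read_vocab` and `missing_labels` are made up for this example.

```python
from collections import Counter

# Assumed delimiters of the preprocessed training file (adjust if yours differ).
LABEL_DELIM = "SEPL|||SEPR"  # separates a token from its labels
OP_DELIM = "SEPL__SEPR"      # separates multiple labels on one token


def read_vocab(lines):
    """Collect the non-empty lines of labels.txt into a set of known labels."""
    return {line.strip() for line in lines if line.strip()}


def missing_labels(train_lines, vocab, label_delim=LABEL_DELIM, op_delim=OP_DELIM):
    """Count every label used in the training data that is absent from vocab."""
    missing = Counter()
    for line in train_lines:
        for item in line.split():
            if label_delim not in item:
                continue  # skip anything that is not in token+labels form
            _, labels = item.rsplit(label_delim, 1)
            for label in labels.split(op_delim):
                if label not in vocab:
                    missing[label] += 1
    return missing
```

Both functions accept any iterable of lines, so with real files you can pass the open file objects for labels.txt and the preprocessed training file. Any label that shows up in the result is one the model has no vocabulary entry for, which is how it ends up as @@UNKNOWN@@.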