Having trouble reproducing results in the paper
sqsalt opened this issue · comments
sqsalt commented
Thanks for the amazing idea proposed in your paper.
But I have some trouble reproducing the results on the SIGHAN15 dataset. I trained a TtT model on the HybridSet training set for almost 40 epochs, achieving 98% ACC on the HybridSet dev set, but only 0.77 F1. Even worse, performance drops sharply to P/R/F1 of 0.50/0.76/0.61 on the SIGHAN15 test set.
Without the detailed training settings or an open-sourced model, I cannot tell which part of my model failed, so here are my questions:
- What is the expected training loss on HybridSet (both the NLL loss and the CRF loss)?
- What is the expected ACC/F1 on HybridSet?
- Is the weight of the fc layer (language-model layer) randomly initialized, or copied from BERT's embedding layer?
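To make the last question concrete, here is a minimal sketch of the two initialization options I mean, assuming a standard weight-tying setup (the dimensions are those of bert-base-chinese; the layer names are illustrative, not taken from your code):

```python
import torch

# bert-base-chinese dimensions (illustrative)
vocab_size, hidden_size = 21128, 768

# Stand-in for BERT's input word-embedding table
word_embeddings = torch.nn.Embedding(vocab_size, hidden_size)

# Option 1: fc (LM) head with randomly initialized weights
fc_random = torch.nn.Linear(hidden_size, vocab_size, bias=False)

# Option 2: fc head sharing its weight with the embedding table
# (nn.Linear stores weight as (out_features, in_features) = (vocab, hidden),
# so the embedding matrix can be assigned directly)
fc_tied = torch.nn.Linear(hidden_size, vocab_size, bias=False)
fc_tied.weight = word_embeddings.weight  # shared parameter, not a copy
```

Knowing which option you used would let me rule out the fc-layer initialization as the source of the gap.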
Thanks!
Piji Li commented
Will release the code before 31 Dec.