Something wrong about accuracy.

Question

xc15071347094 opened this issue 4 years ago

Dear author,
Hello, I am currently reproducing the experiments from your paper. Following the instructions in the README, I ran "pipeline_PlanEnc.sh", but the reported accuracy is always zero. I have searched extensively but still cannot find a solution. Could this be caused by a step I have missed?
The following is the relevant output from the run:

Loading train dataset from data/rnn_notdelex_exp.train.1.pt, number of examples: 18071
Epoch 5, 50/ 283; acc: 0.00; ppl: 3.76; xent: 1.32; 20891 src tok/s; 16978 tgt tok/s; 5 s elapsed
Epoch 5, 100/ 283; acc: 0.00; ppl: 3.72; xent: 1.31; 20883 src tok/s; 17391 tgt tok/s; 9 s elapsed
Epoch 5, 150/ 283; acc: 0.00; ppl: 3.74; xent: 1.32; 21349 src tok/s; 17339 tgt tok/s; 14 s elapsed
Epoch 5, 200/ 283; acc: 0.00; ppl: 3.43; xent: 1.23; 20920 src tok/s; 17546 tgt tok/s; 18 s elapsed
Epoch 5, 250/ 283; acc: 0.00; ppl: 3.36; xent: 1.21; 20454 src tok/s; 17024 tgt tok/s; 23 s elapsed
Train perplexity: 3.58627
Train accuracy: 0.0000
Loading valid dataset from data/rnn_notdelex_exp.valid.1.pt, number of examples: 2262
Validation perplexity: 3.36388
Validation accuracy: 0.0000

Loading train dataset from data/rnn_notdelex_exp.train.1.pt, number of examples: 18071
Epoch 6, 50/ 283; acc: 0.00; ppl: 3.25; xent: 1.18; 2007 src tok/s; 1631 tgt tok/s; 48 s elapsed
Epoch 6, 100/ 283; acc: 0.00; ppl: 3.25; xent: 1.18; 1344 src tok/s; 1119 tgt tok/s; 118 s elapsed
Epoch 6, 150/ 283; acc: 0.00; ppl: 3.24; xent: 1.18; 1388 src tok/s; 1127 tgt tok/s; 189 s elapsed
Epoch 6, 200/ 283; acc: 0.00; ppl: 3.08; xent: 1.12; 1348 src tok/s; 1131 tgt tok/s; 257 s elapsed
Epoch 6, 250/ 283; acc: 0.00; ppl: 3.05; xent: 1.12; 1317 src tok/s; 1096 tgt tok/s; 326 s elapsed
Train perplexity: 3.17113
Train accuracy: 0.0000
Loading valid dataset from data/rnn_notdelex_exp.valid.1.pt, number of examples: 2262
Validation perplexity: 3.08522
Validation accuracy: 0.0000

I am looking forward to your reply at your earliest convenience. Thank you very much.

zhaochaocs · Answer 1 · Tue Dec 01 2020 12:37:06 GMT+0800 (China Standard Time)

Hello, thanks for your interest in our work. I do not remember the details of the log, but the ppl seems to be decreasing as expected, which indicates that the decoder is improving. The unexpected acc value might be caused by a bug in the evaluation script. I think you can safely ignore it; we do not use acc to evaluate the quality of a decoder.
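
As a quick sanity check on the log in the question, the ppl column looks like the exponential of the xent column, which is what one would expect if xent is the mean per-token cross-entropy in nats (an assumption here, not something confirmed by the repository). The sketch below copies a few (xent, ppl) pairs from that log and compares them; the helper name `perplexity` is made up for illustration.

```python
import math

# (xent, ppl) pairs copied from the training log quoted in this thread
logged = [(1.32, 3.76), (1.31, 3.72), (1.23, 3.43), (1.18, 3.25), (1.12, 3.05)]


def perplexity(xent: float) -> float:
    """Perplexity from mean per-token cross-entropy, assuming natural log."""
    return math.exp(xent)


for xent, ppl in logged:
    # xent is rounded to two decimals in the log, so expect small differences
    print(f"xent={xent:.2f}  exp(xent)={perplexity(xent):.2f}  logged ppl={ppl:.2f}")
```

If the two columns agree up to rounding, the loss is being computed and is going down, so a constant acc of 0.00 would point to how acc is reported rather than to training itself.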

Winston Anderson · Answer 2 · Tue Dec 01 2020 12:47:42 GMT+0800 (China Standard Time)

Dear @zhaochaocs
Thank you very much for your timely reply. Can the "acc=0" problem be solved? In addition, does this issue affect the final model generated during training?

zhaochaocs · Answer 3 · Tue Dec 01 2020 13:36:06 GMT+0800 (China Standard Time)

Yes, it can be solved once you've decided how to evaluate the accuracy of the generated texts. One straightforward way is to check the token overlap between the predicted text and the reference, but I don't think it can give us better guidance than ppl. I don't think it will affect the final model.
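
A minimal sketch of the token-overlap idea mentioned above, assuming whitespace tokenization and lowercase matching. The function name `token_overlap` and the example sentences are hypothetical, and this is not the repository's evaluation script.

```python
from collections import Counter


def token_overlap(prediction: str, reference: str) -> dict:
    """Bag-of-tokens overlap between a predicted text and its reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # Multiset intersection: a reference token can be matched at most
    # as many times as it occurs in the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(pred_tokens) if pred_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical prediction/reference pair, for illustration only
scores = token_overlap("the cat sat", "the cat sat on the mat")
print(scores)
```

This measure ignores word order and meaning, so it is only a rough signal. That is consistent with the point above that it may not give better guidance than ppl.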

Winston Anderson · Answer 4 · Tue Dec 01 2020 13:47:06 GMT+0800 (China Standard Time)

Dear @zhaochaocs
Thank you very much for your kind help. I will give this a try.