About a DialogueRNN performance problem
Columbine21 opened this issue
There are some differences in the results because the training strategy is somewhat different. The earlier DialogueRNN model was trained in a two-stage setup, which generally gives slightly better results. For the new models in this repo, we have implemented all models end-to-end to make training and all the different evaluation strategies more flexible.
We observed some variance in the results for the GloVe models in the end-to-end setup. We discuss these observations in Section 5, page 12 of our paper. For this reason, we ran each model more than 20 times and averaged the results in Tables 3 and 4.
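As a side note, reporting a metric averaged over repeated runs, as described above, is usually done by recording each run's score and then computing the mean (often with the spread alongside). A minimal sketch, using made-up scores rather than the actual results from the paper:

```python
# Hedged sketch: summarizing a metric over repeated training runs.
# The scores below are hypothetical examples, not real results.
import statistics

# Hypothetical W-Avg F1 scores from repeated end-to-end runs of one model.
run_scores = [62.1, 63.4, 61.8, 62.9, 63.0]

mean = statistics.mean(run_scores)
std = statistics.stdev(run_scores)  # sample standard deviation
print(f"W-Avg F1 over {len(run_scores)} runs: {mean:.2f} +/- {std:.2f}")
```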
Thanks a lot. By the way, what about the earlier DialogueRNN model (trained in the two-stage setup)? Would you mind sharing its W-Avg F1 score on the IEMOCAP dataset?
For the earlier two-stage DialogueRNN model, the W-Avg F1 score is 62.75 on IEMOCAP.
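For readers unfamiliar with the metric, the weighted-average (W-Avg) F1 quoted here is commonly computed as the per-class F1 scores weighted by class support. A minimal sketch with made-up labels (not actual IEMOCAP predictions), using scikit-learn:

```python
# Hedged sketch: computing a weighted-average F1 score.
# Labels and predictions below are hypothetical, not real model output.
from sklearn.metrics import f1_score

# Hypothetical per-utterance emotion labels encoded as integers.
y_true = [0, 1, 2, 1, 0, 3, 2, 1]
y_pred = [0, 1, 1, 1, 0, 3, 2, 2]

# 'weighted' averages per-class F1 scores, weighting each class by its
# support (number of true instances), which is what "W-Avg F1" denotes.
w_avg_f1 = f1_score(y_true, y_pred, average="weighted")
print(f"W-Avg F1: {w_avg_f1:.4f}")
```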
Thanks for your reply! I read your survey. It really helps.