declare-lab / dialogue-understanding

This repository contains the PyTorch implementation of the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study.


About a DialogueRNN performance problem

Columbine21 opened this issue · comments

Your DialogueRNN performance (in your end-to-end implementation, with bidirectional encoding and the attention-on-emotion strategy):
[Screenshot, 2020-10-12: DialogueRNN results from this repo]
Original DialogueRNN performance:
[Screenshot: results from the original DialogueRNN paper]

From the above, is your implementation of DialogueRNN worse than the one in the original paper?

There are some differences in the results because the training strategy is somewhat different. The earlier DialogueRNN model was trained in a two-stage setup, which generally gives slightly better results. For the new models in this repo, we have implemented everything end-to-end to make training and all the different evaluation strategies more flexible.
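A minimal sketch of the difference between the two setups, using toy linear modules as hypothetical stand-ins for the utterance feature extractor and the context/emotion classifier (this is not the repo's actual code, just an illustration of the training strategies):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: a 10-dim utterance feature extractor and a
# context model that predicts one of 6 emotion classes.
feature_extractor = nn.Linear(10, 8)
context_model = nn.Linear(8, 6)

x = torch.randn(4, 10)             # 4 utterances, 10-dim inputs
y = torch.randint(0, 6, (4,))      # emotion labels

# --- Two-stage setup: the feature extractor is trained separately first;
# --- in stage 2 it is frozen and only the context model is updated.
for p in feature_extractor.parameters():
    p.requires_grad = False
stage2_opt = torch.optim.Adam(context_model.parameters(), lr=1e-3)
loss = nn.functional.cross_entropy(context_model(feature_extractor(x)), y)
loss.backward()                    # gradients reach only the context model
stage2_opt.step()

# --- End-to-end setup: one optimizer updates both modules jointly.
for p in feature_extractor.parameters():
    p.requires_grad = True
e2e_opt = torch.optim.Adam(
    list(feature_extractor.parameters()) + list(context_model.parameters()),
    lr=1e-3,
)
e2e_opt.zero_grad()
loss = nn.functional.cross_entropy(context_model(feature_extractor(x)), y)
loss.backward()                    # gradients now reach both modules
e2e_opt.step()
```

In the two-stage pass the frozen extractor receives no gradients; in the end-to-end pass the same loss updates both modules, which is what makes alternative evaluation strategies easier to plug in.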

We observed some variance in the results for the GloVe models in the end-to-end setup; we write about these observations in Section 5, page 12 of our paper. For this reason, we ran each model more than 20 times and averaged the results in Tables 3 and 4.

Thanks a lot. By the way, for the earlier DialogueRNN model (trained in the two-stage setup), would you share its W-Avg F1 score on the IEMOCAP dataset?

For the earlier two-stage DialogueRNN model, the W-Avg F1 score is 62.75 on IEMOCAP.

Thanks for your reply! I read your survey; it really helped.