DAMD evaluation
hpsun1109 opened this issue
Hi, thanks for releasing the source code.
However, the result in this log is much higher than the one reported in the paper. I don't understand what differs between the two settings.
Hi,
The higher result comes from using the ground truth belief state to search the database. It is comparable to line 7 in Table 2 of our paper (slightly higher, since the paper reports the average score over 5 runs). To reproduce the result of line 12, you need to set "bspn_mode='bspn'" and "use_true_bspn_for_ctr_eval=False".
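As a rough illustration of the two settings mentioned above, here is a minimal sketch of applying them to a config object. Only the option names `bspn_mode` and `use_true_bspn_for_ctr_eval` and their values come from this reply; `apply_overrides`, `END_TO_END_OVERRIDES`, and the `SimpleNamespace` stand-in are hypothetical and are not the repository's actual config code, so check the repo for how options are really passed.

```python
from types import SimpleNamespace

# The two settings quoted in the reply for reproducing line 12 of Table 2.
END_TO_END_OVERRIDES = {
    "bspn_mode": "bspn",                  # use the generated belief span
    "use_true_bspn_for_ctr_eval": False,  # do not use the ground truth belief state
}


def apply_overrides(cfg, overrides):
    """Set each override on cfg, rejecting names cfg does not already define."""
    for name, value in overrides.items():
        if not hasattr(cfg, name):
            raise AttributeError(f"unknown config option: {name}")
        setattr(cfg, name, value)
    return cfg


# Hypothetical stand-in for the repository's config object. The initial
# values are placeholders, not the project's real defaults.
cfg = SimpleNamespace(bspn_mode=None, use_true_bspn_for_ctr_eval=None)
apply_overrides(cfg, END_TO_END_OVERRIDES)
print(cfg.bspn_mode, cfg.use_true_bspn_for_ctr_eval)
```

Rejecting unknown names is a deliberate choice in this sketch: a misspelled option would otherwise be set silently and the evaluation would keep using the ground truth belief state.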
Thanks for the information.