Name
RL4MPTD
Overview
This code replicates the experiments in my SIGDIAL paper.
Description
RL4MPTD applies several combinations of reinforcement learning algorithms and reward functions to learn an effective dialogue policy in a multi-party trading situation. It also provides a simulated trading environment for comparing the learned policies against a random policy or a hand-crafted policy.
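The kind of baseline comparison described above can be illustrated with a minimal sketch. It is not the RL4MPTD code or its API: the toy trading setting, the acceptance probabilities, the reward values (+10 on success, -1 per turn), and every name in it (`run_episode`, `random_policy`, `handcrafted_policy`, `evaluate`) are assumptions made up for illustration. It only shows how a random policy and a hand-crafted policy might be scored with the same reward function in a simulated environment; a learned policy would be evaluated the same way.

```python
import random

# Hypothetical toy setting, NOT the RL4MPTD environment: a trader wants to
# collect oranges. Each turn it addresses one of several partners, and each
# partner accepts a 1-for-1 swap with a fixed (assumed) probability.
ACCEPT_PROB = [0.2, 0.5, 0.8]  # assumed acceptance rate per partner
TARGET_ORANGES = 3             # assumed goal of the trader
MAX_TURNS = 10                 # assumed dialogue length limit


def run_episode(policy, rng):
    """Play one dialogue; reward is -1 per turn plus +10 if the goal is met."""
    oranges, reward = 0, 0
    for _ in range(MAX_TURNS):
        partner = policy(rng)          # the policy picks whom to trade with
        reward -= 1                    # every turn has a cost
        if rng.random() < ACCEPT_PROB[partner]:
            oranges += 1               # the offer was accepted
        if oranges >= TARGET_ORANGES:
            return reward + 10         # success bonus
    return reward


def random_policy(rng):
    """Baseline: address a uniformly random partner."""
    return rng.randrange(len(ACCEPT_PROB))


def handcrafted_policy(rng):
    """Baseline: always address the most cooperative partner."""
    return max(range(len(ACCEPT_PROB)), key=ACCEPT_PROB.__getitem__)


def evaluate(policy, episodes=2000, seed=0):
    """Average reward of a policy over many simulated dialogues."""
    rng = random.Random(seed)
    return sum(run_episode(policy, rng) for _ in range(episodes)) / episodes


if __name__ == "__main__":
    print("random policy:      ", evaluate(random_policy))
    print("hand-crafted policy:", evaluate(handcrafted_policy))
```

In this toy setting the hand-crafted policy should score higher on average than the random one, because it always addresses the partner most likely to accept.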
Demo
TBA
Requirement
TBA
Usage
TBA
Tips
TBA
Contribution
TBA
Licence
TBA