liangkai/Reinforcement-Learning-in-Multi-Party-Trading-Dialog

Name

RL4MPTD

Overview

This code replicates the experiments in my SIGDIAL paper.

Description

In RL4MPTD, several combinations of reinforcement learning algorithms and reward functions are applied to learn effective dialogue policies in multi-party trading situations. RL4MPTD provides a simulated trading environment for comparing the learned policies with a random policy or a hand-crafted policy.
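
The comparison described above can be illustrated with a minimal sketch. It is not the code in this repository: the toy environment, the action names, the acceptance probabilities, and the choice of tabular Q-learning are all assumptions made for illustration. A learner holds two apples and, on each turn, offers one to trading partner A, offers one to partner B, or keeps its items; each accepted offer earns a reward of 1.

```python
import random

# Hypothetical toy setup; every name and number here is an assumption.
ACTIONS = ("offer_to_A", "offer_to_B", "keep")
ACCEPT_PROB = {"offer_to_A": 0.8, "offer_to_B": 0.2}
MAX_TURNS = 4
START_APPLES = 2


def step(state, action, rng):
    """Apply one action and return (next_state, reward, done)."""
    apples, turn = state
    reward = 0.0
    if action != "keep" and rng.random() < ACCEPT_PROB[action]:
        apples -= 1
        reward = 1.0  # one apple traded for one orange
    turn += 1
    done = apples == 0 or turn == MAX_TURNS
    return (apples, turn), reward, done


def run_episode(policy, rng):
    """Run one dialogue and return the total reward collected."""
    state, total, done = (START_APPLES, 0), 0.0, False
    while not done:
        state, reward, done = step(state, policy(state), rng)
        total += reward
    return total


def train_q_learning(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state, done = (START_APPLES, 0), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = step(state, action, rng)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


def evaluate(policy, episodes=2000, seed=1):
    """Average total reward of a policy over simulated dialogues."""
    rng = random.Random(seed)
    return sum(run_episode(policy, rng) for _ in range(episodes)) / episodes


q_table = train_q_learning()
policy_rng = random.Random(2)


def learned_policy(state):
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))


def random_policy(state):
    return policy_rng.choice(ACTIONS)


learned_score = evaluate(learned_policy)
random_score = evaluate(random_policy)
print(f"learned: {learned_score:.2f}  random: {random_score:.2f}")
```

A hand-crafted baseline could be evaluated the same way by passing another function of the state to `evaluate`, for example one that always returns `"offer_to_A"`.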

Demo

TBA

Requirements

TBA

Usage

TBA

Tips

TBA

Contribution

TBA

Licence

TBA

Author

TakuyaHiroka
