The result is not that ideal like the paper showed

Question

Jarvis-K opened this issue 6 years ago

I ran MADDPG on simple_speaker_listener several times, but none of the runs reached the average reward of about -20 reported in the paper. Is there anything I should modify to get a better or more stable result?

Agor Maxime · Answer 1 · Wed Nov 21 2018 22:42:18 GMT+0800 (China Standard Time)

Looks like you're not the only one having trouble reproducing some results: #12

Bolun Dai · Answer 2 · Sat Jun 08 2019 02:35:26 GMT+0800 (China Standard Time)

I am getting rewards of about -60. Is that normal when running the code without any alterations?

Ken · Answer 3 · Thu Nov 14 2019 13:33:50 GMT+0800 (China Standard Time)

Also, with scenario=simple_speaker_listener, this code does not converge to the result reported in Fig. 4. Does anyone know what the problem is?