openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Home Page: https://arxiv.org/pdf/1706.02275.pdf

The results are not as good as those shown in the paper

Jarvis-K opened this issue

I ran MADDPG on simple_speaker_listener several times, but none of the runs reached the average reward of -20 reported in the paper. Is there anything I should modify to get a better or more stable result?

Looks like you're not the only one having trouble reproducing some results: #12

I am getting rewards of -60. Is that normal when running the code without any alterations?

Also, with scenario=simple_speaker_listener, this code does not converge to the result reported in Fig. 4. Does anyone know what the problem is?