openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Home Page: https://arxiv.org/pdf/1706.02275.pdf

Cannot reproduce experiment results

arbaazkhan2 opened this issue · comments

Is this code vastly different from the code used to generate results for the paper?
I cannot reproduce any of the results for the simple_spread, simple_reference, or simple_tag experiments, even after running for over 2 million iterations. The policy doesn't even look like it's improving. Any tips on this? Has anyone else gotten it to work?

Further, I don't see the policy ensemble or other-agent policy estimation parts (Sections 4.2 and 4.3 in the paper) in the code. Am I missing something?

> Further, I don't see the policy ensemble or other-agent policy estimation parts (Sections 4.2 and 4.3 in the paper) in the code. Am I missing something?

This was answered in #8; that code isn't in this repo.

Hi! There was a bug in the code that prevented rewards from being shared in collaborative environments. This should be fixed now! Note that the results will differ from the paper, since we refactored the code after publication, but the models should still train.
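For readers unfamiliar with the bug: in collaborative scenarios every agent is supposed to optimize the same team reward, whereas the buggy version left each agent with only its individual reward. A minimal sketch of the intended behavior (illustrative only; `share_rewards` is a hypothetical helper, not the repo's actual code):

```python
import numpy as np

def share_rewards(rewards, cooperative):
    """Return the per-agent reward list used for training.

    In a cooperative environment, every agent receives the same
    team reward (here, the sum of individual rewards); otherwise
    each agent keeps its own reward.
    """
    if cooperative:
        team_reward = float(np.sum(rewards))
        return [team_reward] * len(rewards)
    return list(rewards)
```

For example, `share_rewards([1.0, 2.0, 3.0], cooperative=True)` gives every agent the team reward `6.0`, while with `cooperative=False` each agent keeps its own reward.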

For the ensemble policies / estimating other agents' policies, that code was created by Yi Wu. Please contact him if you'd like it to be open-sourced.
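For context, Section 4.2 of the paper fits an approximate policy to each other agent by maximizing the log probability of that agent's observed actions plus an entropy regularizer. A rough numpy sketch of one such update for a discrete softmax policy (my own illustration of the idea, not Yi Wu's code; `approx_policy_update` and its hyperparameters are made up for this example):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def approx_policy_update(logits, observed_action, lr=0.5, ent_coef=1e-3):
    """One gradient ascent step on log mu_hat(a|o) + ent_coef * H(mu_hat),
    fitting an approximate policy to another agent's observed action
    (cf. Section 4.2 of the MADDPG paper)."""
    p = softmax(logits)
    onehot = np.zeros_like(logits)
    onehot[observed_action] = 1.0
    grad_loglik = onehot - p                  # d log p(a) / d logits
    entropy = -(p * np.log(p)).sum()
    grad_entropy = -p * (np.log(p) + entropy)  # d H / d logits
    return logits + lr * (grad_loglik + ent_coef * grad_entropy)
```

Repeatedly observing the same action pushes the approximate policy's probability mass toward it, while the entropy term keeps the estimate from collapsing too quickly.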

commented

For policy ensemble and approximation, I have put the code online for easy access:
https://www.dropbox.com/s/jlc6dtxo580lpl2/maddpg_ensemble_and_approx_code.zip?dl=0