openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Home Page: https://arxiv.org/pdf/1706.02275.pdf

Training EVERY step, not every 100

eflopez1 opened this issue

Hello,

I wanted to verify something I found in your code. In the method MADDPGAgentTrainer.update(), a comment next to the following line states that an update only occurs every 100 steps:

if not t % 100 == 0:  # only update every 100 steps
    return

I could be misreading this, but doesn't this line mean that an update occurs on every step and is skipped only on steps where t % 100 == 0?

t % 100 == 0 is true on every 100th step.

Python evaluates "if not t % 100 == 0: return" as "if not ((t % 100) == 0): return", so the return statement executes on every step except each 100th one. The rest of the update function therefore runs only on every 100th step, which means the update is performed once every 100 steps, as the comment says.
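To check this reasoning, here is a minimal sketch that reuses the same guard in isolation. The function name should_update and the step range are hypothetical and are not part of the MADDPG code; only the condition itself is copied from the line quoted in the question.

```python
def should_update(t):
    """Hypothetical stand-in for the guard at the top of update()."""
    # Same condition as in the issue; Python parses it as
    # not ((t % 100) == 0), i.e. "t is not a multiple of 100".
    if not t % 100 == 0:
        return False  # early return: no update on this step
    return True       # reached only when t is a multiple of 100


# Collect the steps in 1..500 on which the update would actually run.
update_steps = [t for t in range(1, 501) if should_update(t)]
print(update_steps)  # [100, 200, 300, 400, 500]
```

Only the multiples of 100 pass the guard, so the update runs once every 100 steps and is skipped on all the others.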