Implement Asynchronous PPO
Dahoas opened this issue
The feature, motivation, and pitch
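Model rollout/exploration is the largest bottleneck in the training process. Implementing an asynchronous PPO, where rollout generation runs concurrently with policy updates instead of blocking them, would mitigate this.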
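To make the request concrete, here is a rough sketch of the producer/consumer structure I have in mind. It is not based on the current trainer code: `rollout_worker`, `train_async`, the rollout dictionary, and the `policy_version` counter are all hypothetical names, and both the sampling and the PPO update are placeholders. It only shows rollouts being generated in a background thread while the learner consumes them in batches.

```python
import queue
import threading


def rollout_worker(rollout_queue, get_policy_version, num_rollouts):
    """Hypothetical producer: generates rollouts in the background."""
    for i in range(num_rollouts):
        # Placeholder for sampling from the model. A real rollout would hold
        # tokens, log-probs, values and rewards instead of just an id.
        rollout = {"id": i, "policy_version": get_policy_version()}
        rollout_queue.put(rollout)  # blocks while the queue is full
    rollout_queue.put(None)  # sentinel: no more rollouts


def train_async(num_rollouts=8, batch_size=2, max_queue_size=4):
    """Hypothetical learner loop that consumes rollouts as they arrive."""
    rollout_queue = queue.Queue(maxsize=max_queue_size)
    state = {"policy_version": 0}
    lock = threading.Lock()

    def get_policy_version():
        with lock:
            return state["policy_version"]

    def update_policy(batch):
        # Placeholder for a PPO update on `batch`; here it only bumps a counter.
        with lock:
            state["policy_version"] += 1

    worker = threading.Thread(
        target=rollout_worker,
        args=(rollout_queue, get_policy_version, num_rollouts),
    )
    worker.start()

    consumed, batch = [], []
    while True:
        item = rollout_queue.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            update_policy(batch)
            consumed.extend(batch)
            batch = []
    if batch:  # train on any leftover partial batch
        update_policy(batch)
        consumed.extend(batch)

    worker.join()
    return consumed, state["policy_version"]


if __name__ == "__main__":
    rollouts, final_version = train_async()
    print(len(rollouts), "rollouts consumed;", final_version, "policy updates")
```

The bounded queue limits how far the rollout worker can run ahead of the learner. Each rollout records the policy version it was sampled with, because in an asynchronous setup the data may be slightly off-policy by the time it is trained on, and a real implementation would have to decide how much staleness to tolerate.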
Alternatives
No response
Additional context
No response