vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Home Page: http://docs.cleanrl.dev

Upgrade gym version to 0.26.1

AdityaGudimella opened this issue · comments

Problem Description

Upgrade gym version used in cleanrl from 0.23.1 to 0.25.1

Possible Solution

Stable Baselines' ReplayBuffer currently does not support the new format returned by gym.Env.step. The gym step API changed from:
obs, rew, done, info = env.step(action) to obs, rew, terminated, truncated, info = env.step(action).
We would need to implement a slightly modified version of the ReplayBuffer in CleanRL itself. Other than this, the changes required are minimal.
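To make the idea concrete, here is a rough sketch of what a buffer that stores the two new flags separately might look like. It is not the Stable Baselines class or anything that exists in CleanRL; `Transition`, `SimpleReplayBuffer`, and `td_target` are hypothetical names made up for illustration, and it uses only the standard library so it runs on its own.

```python
import random
from collections import deque
from typing import Any, Deque, List, NamedTuple


class Transition(NamedTuple):
    obs: Any
    action: Any
    reward: float
    next_obs: Any
    terminated: bool  # episode reached a true terminal state
    truncated: bool   # episode was cut off, e.g. by a time limit


class SimpleReplayBuffer:
    """Hypothetical minimal buffer for the 5-tuple step API (illustration only)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # A bounded deque drops the oldest transition once capacity is reached.
        self._storage: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, obs, action, reward, next_obs, terminated, truncated) -> None:
        self._storage.append(
            Transition(obs, action, float(reward), next_obs,
                       bool(terminated), bool(truncated))
        )

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self) -> int:
        return len(self._storage)


def td_target(t: Transition, next_value: float, gamma: float = 0.99) -> float:
    # Assumed convention: bootstrap through truncation, but not through
    # true termination, so only `terminated` masks the next-state value.
    return t.reward + gamma * next_value * (0.0 if t.terminated else 1.0)
```

Keeping `terminated` and `truncated` as separate fields lets each algorithm decide how to treat time-limit cutoffs, rather than collapsing both into a single done flag at storage time.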

I can submit an initial PR with changes required for SAC if you're interested.

Update on the ticket: the current gym master is set to be released as 0.26.0, which makes obs, rew, terminated, truncated, info = env.step(action) the default.
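While both formats are in circulation, a small shim that accepts either return shape could ease the transition. The sketch below is an assumption about how one might write it, not code from gym or CleanRL; `normalize_step` is a hypothetical helper. For the old 4-tuple it relies on the `"TimeLimit.truncated"` key that gym's TimeLimit wrapper put in `info`, and treats a missing key as "not truncated".

```python
from typing import Any, Dict, Tuple


def normalize_step(result: tuple) -> Tuple[Any, float, bool, bool, Dict[str, Any]]:
    """Hypothetical helper: return (obs, rew, terminated, truncated, info)
    from either the old 4-tuple or the new 5-tuple step result."""
    if len(result) == 5:
        obs, rew, terminated, truncated, info = result
        return obs, rew, bool(terminated), bool(truncated), info
    if len(result) == 4:
        obs, rew, done, info = result
        # Old API: a time-limit cutoff was signalled through the info dict.
        truncated = bool(info.get("TimeLimit.truncated", False))
        # Whatever part of `done` is not a truncation counts as termination.
        terminated = bool(done) and not truncated
        return obs, rew, terminated, truncated, info
    raise ValueError(f"unexpected step result of length {len(result)}")
```

Usage would be `obs, rew, terminated, truncated, info = normalize_step(env.step(action))`, so the rest of the training loop only has to deal with the new format.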

#424 closes this issue by switching to gymnasium.