suite_gym.load('CartPole-v1', max_episode_steps=10000) does not modify the default max_episode_steps

Question

fede72bari opened this issue a year ago

When training on a simple CartPole environment, trying to extend the default max_episode_steps as follows

env = suite_gym.load('CartPole-v1', max_episode_steps=10000)
env = tf_py_environment.TFPyEnvironment(env)

according to https://www.tensorflow.org/agents/api_docs/python/tf_agents/environments/suite_gym/load#args

tf_agents.environments.suite_gym.load(
    environment_name: Text,
    discount: tf_agents.typing.types.Float = 1.0,
    max_episode_steps: Optional[types.Int] = None,
    gym_env_wrappers: Sequence[tf_agents.typing.types.GymEnvWrapper] = (),
    env_wrappers: Sequence[tf_agents.typing.types.PyEnvWrapper] = (),
    spec_dtype_map: Optional[Dict[gym.Space, np.dtype]] = None,
    gym_kwargs: Optional[Dict[str, Any]] = None,
    render_kwargs: Optional[Dict[str, Any]] = None
) -> tf_agents.environments.PyEnvironment

I get an environment whose episodes are still limited to 500 steps, while I would like to extend them to 10000 steps. In the training code, I reset the environment as follows:

    time_step = environment.current_time_step()    
    if time_step.is_last():
        time_step = environment.reset()

I hope this last part is coded correctly.
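
One way to check which step limit is actually in effect is to count the steps between reset() and the first time step for which is_last() is True. Below is a minimal sketch of that counting loop. ToyEnv and ToyTimeStep are hypothetical stand-ins written only for this illustration (they are not part of TF-Agents) and model nothing but the reset()/step()/is_last() calls used above; measure_episode_length and its safety_cap parameter are likewise assumed names.

```python
from dataclasses import dataclass


@dataclass
class ToyTimeStep:
    """Hypothetical stand-in for a time step; only is_last() is modeled."""
    last: bool

    def is_last(self) -> bool:
        return self.last


class ToyEnv:
    """Hypothetical environment that ends every episode after `limit` steps."""

    def __init__(self, limit: int):
        self._limit = limit
        self._steps = 0

    def reset(self) -> ToyTimeStep:
        self._steps = 0
        return ToyTimeStep(last=False)

    def step(self, action) -> ToyTimeStep:
        self._steps += 1
        return ToyTimeStep(last=self._steps >= self._limit)


def measure_episode_length(env, choose_action, safety_cap: int = 20000) -> int:
    """Count step() calls from reset() until is_last() is True (or the cap is hit)."""
    time_step = env.reset()
    steps = 0
    while not time_step.is_last() and steps < safety_cap:
        time_step = env.step(choose_action(time_step))
        steps += 1
    return steps


# A toy environment capped at 500 steps yields an episode length of 500.
print(measure_episode_length(ToyEnv(limit=500), lambda ts: 0))
```

With a real CartPole environment the episode also ends when the pole falls, so this count only reveals the step cap if the policy can keep the pole balanced that long.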