Incorrect number of arguments from call to env.step(action)

Question

JRod-Seed opened this issue 2 years ago

I just installed gym-super-mario-bros and I've attempted to run the gym using a number of methods including the provided sample code:

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
    if done:
        state = env.reset()
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()

env.close()

... as well as from the command line using:

gym_super_mario_bros -e SuperMarioBrosRandomStages-v0 -m human --stages 1-1

In both cases, the gym attempts to render the first frame of the game and then I receive the following error:

/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/envs/registration.py:555: UserWarning: WARN: The environment SuperMarioBrosRandomStages-v0 is out of date. You should consider upgrading to version v3.
logger.warn(
/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:195: UserWarning: WARN: The result returned by env.reset() was not a tuple of the form (obs, info), where obs is a observation and info is a dictionary containing additional information. Actual type: <class 'numpy.ndarray'>
logger.warn(
/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:219: DeprecationWarning: WARN: Core environment is written in old step API which returns one bool instead of two. It is recommended to rewrite the environment with new step API.
logger.deprecation(
Traceback (most recent call last):
  File "/Users/--/miniforge3/envs/tf/bin/gym_super_mario_bros", line 8, in <module>
    sys.exit(main())
  File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym_super_mario_bros/_app/cli.py", line 71, in main
    play_human(env)
  File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/nes_py/app/play_human.py", line 69, in play_human
    next_state, reward, done, _ = env.step(action)
  File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/wrappers/time_limit.py", line 50, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)

My library versions are as follows:

gym 0.26.2
gym-notices 0.0.8
gym-super-mario-bros 7.4.0
nes-py 8.2.1

Running Python 3.9

Any help would be appreciated

JRod · Answer 1 · Tue Oct 11 2022 04:40:24 GMT+0800 (China Standard Time)

It appears that the issue stems from a mismatch in the expected return value of env.step. gym/wrappers/time_limit.py unpacks 5 values (observation, reward, terminated, truncated, info), but the environment it wraps returns only 4 (observation, reward, done, info).
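
As a minimal illustration of that failure, here is a sketch that reproduces the same ValueError without gym installed. The names old_style_step and unpack_as_new_api are hypothetical stand-ins, not functions from gym or nes-py:

```python
def old_style_step(action):
    """Hypothetical stand-in for an old-API env.step: returns 4 values."""
    observation, reward, done, info = [0, 0, 0], 0.0, False, {}
    return observation, reward, done, info


def unpack_as_new_api(result):
    """Unpack a step result into five names, as a new-API wrapper would."""
    observation, reward, terminated, truncated, info = result
    return observation, reward, terminated, truncated, info


try:
    unpack_as_new_api(old_style_step(0))
    error_message = None
except ValueError as exc:
    # A 4-tuple cannot be unpacked into five names.
    error_message = str(exc)

print(error_message)
```

The message printed here should match the last line of the traceback in the question.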

Likely related to: Kautenja/nes-py#85

Robwc000 · Answer 2 · Wed Oct 12 2022 07:37:04 GMT+0800 (China Standard Time)

I'm having the same issue. What is the fix?

Yuze Li · Answer 3 · Mon Oct 24 2022 13:58:19 GMT+0800 (China Standard Time)

Try using gym==0.25.1; it works for me.
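
To check which side of that version boundary an installation is on, something like the following sketch may help. It assumes the five-value step result became the default in gym 0.26, which is consistent with the versions reported in this thread. The helper names uses_new_step_api and installed_gym_version are hypothetical:

```python
from importlib import metadata


def uses_new_step_api(gym_version):
    """Return True if this gym version is 0.26 or later (assumed new step API)."""
    major, minor = (int(part) for part in gym_version.split(".")[:2])
    return (major, minor) >= (0, 26)


def installed_gym_version():
    """Return the installed gym version string, or None if gym is not installed."""
    try:
        return metadata.version("gym")
    except metadata.PackageNotFoundError:
        return None


version = installed_gym_version()
if version is None:
    print("gym is not installed")
else:
    api = "new step API" if uses_new_step_api(version) else "old step API"
    print(version, api)
```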

Mark Towers · Answer 4 · Wed Oct 26 2022 20:49:04 GMT+0800 (China Standard Time)

With v0.26 you can use this code:

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import gym

env = gym.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
env.reset()
for step in range(5000):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

    if done:
        state = env.reset()

env.close()

You can read the migration guide for the new API at https://gymnasium.farama.org/content/migration-guide/

Holger Szüsz · Answer 5 · Fri Jan 20 2023 00:52:51 GMT+0800 (China Standard Time)

I've debugged the step functions, and it seems that nes-py returns a tuple of length 4 or 5.
They have a code snippet that checks whether len(result) == 4 or 5.
That seems to be the "way to go"; it just is not implemented in gym/wrappers/time_limit.py.

I adapted the snippet and set truncated to False as a default:

def step(self, action):
    """Steps through the environment and if the number of steps elapsed exceeds ``max_episode_steps`` then truncate.

    Args:
        action: The environment step action

    Returns:
        The environment step ``(observation, reward, terminated, truncated, info)`` with `truncated=True`
        if the number of steps elapsed >= max episode steps

    """

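    truncated = False
    # Call the wrapped environment exactly once and unpack that single result;
    # calling self.env.step(action) again would advance the game a second time.
    result = self.env.step(action)

    if len(result) == 4:
        # Old step API: (observation, reward, done, info)
        observation, reward, terminated, info = result
    elif len(result) == 5:
        # New step API: (observation, reward, terminated, truncated, info)
        observation, reward, terminated, truncated, info = result
    else:
        raise ValueError('Tuple length of step return was neither 4 nor 5. Stop run.')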

    self._elapsed_steps += 1

    if self._elapsed_steps >= self._max_episode_steps:
        truncated = True

    return observation, reward, terminated, truncated, info

So instead of always unpacking 5 values, I do the same length check, and this works fine.
But I realise changing the lib is bad mojo, so maybe this could be a fix ;) with a proper Exception added...
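
One way to avoid editing the installed library might be to do the same length check in your own code. The sketch below is only an illustration: normalize_step_result and FiveTupleStep are hypothetical names, not part of gym or nes-py, and the wrapper is duck-typed so the example runs without gym installed. Treating an old-style done flag as terminated with truncated=False is an assumption that loses any time-limit information.

```python
def normalize_step_result(result):
    """Return (observation, reward, terminated, truncated, info) from a 4- or 5-tuple."""
    if len(result) == 5:
        return tuple(result)
    if len(result) == 4:
        # Old step API: treat `done` as `terminated` and assume no truncation.
        observation, reward, done, info = result
        return observation, reward, done, False, info
    raise ValueError(f"expected a step result of length 4 or 5, got {len(result)}")


class FiveTupleStep:
    """Hypothetical duck-typed wrapper that always returns five values from step()."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        # Step the wrapped environment once, then normalize the result.
        return normalize_step_result(self.env.step(action))

    def __getattr__(self, name):
        # Delegate everything else (reset, render, close, ...) to the wrapped env.
        return getattr(self.env, name)


class FakeOldEnv:
    """Stand-in environment using the old 4-value step API, for demonstration only."""

    def step(self, action):
        return "obs", 1.0, True, {}


wrapped = FiveTupleStep(FakeOldEnv())
print(wrapped.step(0))
```

Where such a wrapper has to sit in the wrapper chain depends on how the environment is constructed. In the traceback above, the failing unpack is inside TimeLimit itself, so the normalization would need to happen beneath it.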