Kautenja / gym-super-mario-bros

An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The NES

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Incorrect number of arguments from call to env.step(action)

JRod-Seed opened this issue · comments

commented

I just installed gym-super-mario-bros and I've attempted to run the gym using a number of methods including the provided sample code:

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
if done:
state = env.reset()
state, reward, done, info = env.step(env.action_space.sample())
env.render()

env.close()

... as well as from the command line using:

gym_super_mario_bros -e SuperMarioBrosRandomStages-v0 -m human --stages 1-1

In both cases, the gym attempts to render the first frame of the game and then I receive the following error:

/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/envs/registration.py:555: UserWarning: WARN: The environment SuperMarioBrosRandomStages-v0 is out of date. You should consider upgrading to version v3.
logger.warn(
/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:195: UserWarning: WARN: The result returned by env.reset() was not a tuple of the form (obs, info), where obs is a observation and info is a dictionary containing additional information. Actual type: <class 'numpy.ndarray'>
logger.warn(
/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:219: DeprecationWarning: WARN: Core environment is written in old step API which returns one bool instead of two. It is recommended to rewrite the environment with new step API.
logger.deprecation(
Traceback (most recent call last):
File "/Users/--/miniforge3/envs/tf/bin/gym_super_mario_bros", line 8, in
sys.exit(main())
File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym_super_mario_bros/_app/cli.py", line 71, in main
play_human(env)
File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/nes_py/app/play_human.py", line 69, in play_human
next_state, reward, done, _ = env.step(action)
File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/wrappers/time_limit.py", line 50, in step
observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)

My library versions are as follows:

gym 0.26.2
gym-notices 0.0.8
gym-super-mario-bros 7.4.0
nes-py 8.2.1

Running Python 3.9

Any help would be appreciated

commented

It would appear that the issue stems from a mismatch in the expected return from env.step. gym/wrappers/time_limit.py returns 5 parameters (observation, reward, terminated, truncated, info) but the invocation is only expecting 4 (observation, reward, terminated, info)

Likely related to: Kautenja/nes-py#85

I'm having the same issue what is the fix?

Try to use gym==0.25.1, it works for me

With v0.26 you can use this code

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import gym

env = gym.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
env.reset()
for step in range(5000):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

    if done:
       state = env.reset()

env.close()

You can read about the migration guide for the new API https://gymnasium.farama.org/content/migration-guide/

I've debugged the step functions and it seems that nes-py returns 4 or 5 length tuple.
They have a code snippet that checks for if(Len(result)==4) and for 5.
That seems to be the "way to go" it just is not implemented in the gym/wrappers/time_limit.py

I adapted the snipped and set truncated to False as a default:

def step(self, action):
    """Steps through the environment and if the number of steps elapsed exceeds ``max_episode_steps`` then truncate.

    Args:
        action: The environment step action

    Returns:
        The environment step ``(observation, reward, terminated, truncated, info)`` with `truncated=True`
        if the number of steps elapsed >= max episode steps

    """

    truncated = False
    result = self.env.step(action)

    if (len(result) == 4):
        observation, reward, terminated, info = self.env.step(action)
    elif (len(result) == 5):
        observation, reward, terminated, truncated, info = self.env.step(action)
    else:
        raise Exception('Tupel length of step return was neither 4 nore 5. Stop run.')

    self._elapsed_steps += 1

    if self._elapsed_steps >= self._max_episode_steps:
        truncated = True

    return observation, reward, terminated, truncated, info

so instead of just reading 5 every time I do the same check and this works fine.
But I realise changing the lib is bad mojo so maybe this could be a fix ;) with a proper Exception added...