Kautenja / gym-super-mario-bros

An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The NES

x_pos value unexpectedly drops while it should be increasing (Mario still moving to the right)

rnait opened this issue

Describe the bug

I wrote an AI program that maximizes the x_pos value in the game. After many inconsistent results, I realized that at some point x_pos drops from 1274 to 252 while Mario is still moving to the right, as if it had been reset for no reason. This causes problems when the goal of the algorithm is to maximize x_pos instead of the score.

Reproduction Script

This script renders the game and detects the moment when x_pos drops without justification.

import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
compacted_moves= [(0, 38), (2, 30), (0, 8), (2, 29), (5, 20), (3, 20), (5, 10), (0, 1), (1, 3), (5, 10), (1, 17), (0, 5), (5, 10), (1, 13), (5, 36), (0, 15), (1, 30), (0, 11), (4, 4), (3, 15), (1, 17), (5, 10), (1, 3), (0, 17), (5, 10), (0, 17), (3, 20), (0, 28), (1, 6), (5, 47), (0, 57), (5, 50), (1, 24), (0, 14), (1, 8), (2, 30), (1, 5), (0, 16), (1, 1), (0, 5), (5, 40), (0, 30), (1, 29), (5, 132), (3, 20), (0, 20), (1, 19), (0, 23), (3, 10), (5, 110), (3, 20), (0, 20), (1, 30), (5, 37), (0, 13), (1, 14), (5, 24), (3, 18), (5, 110), (3, 20), (0, 20), (1, 30), (5, 1), (0, 11), (5, 50), (1, 7), (5, 9), (3, 20), (5, 21), (3, 2), (5, 10), (3, 5), (5, 10), (4, 15), (5, 50), (1, 36), (5, 4), (0, 21), (1, 1), (0, 17), (1, 30)]
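def renderMario_withMoves(env, moves,sleepTime=0.001):
    # Replay the moves while rendering, and report any sudden drop in x_pos.
    # sleepTime is unused in this function; it is kept so existing calls still work.
    env.reset()
    env.render()
    prevPos=0
    xPosList=[]
    for i, move in enumerate(moves):
        next_state, reward, done, info = env.step(action=move)
        pos=info['x_pos']
        xPosList.append(pos)
        # A drop of 500 or more between two consecutive steps is not normal movement
        if prevPos-pos>=500:
            print(f"x_pos drops significantly: previous x_pos={prevPos}, current x_pos={pos}, i={i}")
        prevPos=pos
        env.render()
        if done:
            break
    return xPosList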

env = gym_super_mario_bros.make("SuperMarioBros-8-4-v1")
env = JoypadSpace(env, [["right"], ["right", "A"],["A"],["left"],["left","A"],['NOOP']])
moves=[]
# Transform the compacted moves into a list of plain moves
for move in compacted_moves:
    k=move[0]
    repeat=move[1]
    moves=moves+[k]*repeat
# render the game to see that Mario is actually moving to the right when the value of x_pos drops
xPosList=renderMario_withMoves(env,moves)
print(xPosList[-50:-40:])

The printed x_pos values (from the 50th-to-last to the 41st-to-last step) are:
[1264, 1266, 1267, 1269, 1271, 1273, 1274, 252, 254, 256]

Expected behavior

The printed positions should keep increasing, something like this:
[1264, 1266, 1267, 1269, 1271, 1273, 1274, 1276, 1277, 1280]

Additional context

gym_super_mario_bros 7.3.0
nes_py 8.1.8
gym 0.21.0
Python 3.9.7
Windows 10 Pro

8-4 is a puzzle level with circular rooms. You're looping back around.
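
A minimal sketch, not taken from the library, of how an agent's bookkeeping could cope with this kind of jump. It works on a plain list of recorded x_pos values, so it runs without the emulator. The helper names `find_resets` and `progress_deltas` are hypothetical, the threshold of 500 is borrowed from the reproduction script above, and `max_step=5` is an assumed bound on how far Mario can move in a single step.

```python
from typing import List, Sequence


def find_resets(xs: Sequence[int], threshold: int = 500) -> List[int]:
    """Return every index i where xs[i] is at least `threshold` below xs[i - 1]."""
    return [i for i in range(1, len(xs)) if xs[i - 1] - xs[i] >= threshold]


def progress_deltas(xs: Sequence[int], max_step: int = 5) -> List[int]:
    """Per-step change in x_pos, with implausibly large jumps replaced by 0.

    max_step is an assumption: any change larger than this is treated as a
    position reset (a loop or warp) rather than real movement.
    """
    deltas = []
    for prev, cur in zip(xs, xs[1:]):
        d = cur - prev
        deltas.append(d if abs(d) <= max_step else 0)
    return deltas


# The ten values printed in the report above
xs = [1264, 1266, 1267, 1269, 1271, 1273, 1274, 252, 254, 256]
print(find_resets(xs))            # [7]
print(sum(progress_deltas(xs)))   # 14
```

Filtering the jump only stops the drop from being counted as a large backward move. It does not tell the agent whether it took the correct path, so in a looping level like 8-4, x_pos alone is probably not a reliable objective.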

OK thanks. I should play the game before coding on it :-)