x_pos value unexpectedly drops while it should be increasing (Mario still moving to the right)
rnait opened this issue · comments
Describe the bug
I wrote an AI program that maximizes the x_pos value in the game. After seeing many inconsistent results, I realized that at some point x_pos drops from 1274 to 252 while Mario is still moving to the right. It is as if x_pos gets reinitialized for no reason. This causes a lot of problems when the goal of the algorithm is to maximize x_pos rather than the score.
Reproduction Script
This script renders the game and detects the moment when x_pos drops for no apparent reason.
import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
compacted_moves= [(0, 38), (2, 30), (0, 8), (2, 29), (5, 20), (3, 20), (5, 10), (0, 1), (1, 3), (5, 10), (1, 17), (0, 5), (5, 10), (1, 13), (5, 36), (0, 15), (1, 30), (0, 11), (4, 4), (3, 15), (1, 17), (5, 10), (1, 3), (0, 17), (5, 10), (0, 17), (3, 20), (0, 28), (1, 6), (5, 47), (0, 57), (5, 50), (1, 24), (0, 14), (1, 8), (2, 30), (1, 5), (0, 16), (1, 1), (0, 5), (5, 40), (0, 30), (1, 29), (5, 132), (3, 20), (0, 20), (1, 19), (0, 23), (3, 10), (5, 110), (3, 20), (0, 20), (1, 30), (5, 37), (0, 13), (1, 14), (5, 24), (3, 18), (5, 110), (3, 20), (0, 20), (1, 30), (5, 1), (0, 11), (5, 50), (1, 7), (5, 9), (3, 20), (5, 21), (3, 2), (5, 10), (3, 5), (5, 10), (4, 15), (5, 50), (1, 36), (5, 4), (0, 21), (1, 1), (0, 17), (1, 30)]
def renderMario_withMoves(env, moves, sleepTime=0.001):
    # Replay the given moves, rendering each frame and recording x_pos.
    env.reset()
    env.render()
    prevPos = 0
    xPosList = []
    for i in range(0, len(moves)):
        next_state, reward, done, info = env.step(action=moves[i])
        pos = info['x_pos']
        xPosList.append(pos)
        # Report any frame where x_pos falls by 500 or more in a single step.
        if prevPos - pos >= 500:
            print(f"xpos drops significantly prevXpos={prevPos} current xPos {pos} , i= {i}")
        prevPos = pos
        env.render()
        if done:
            break
    return xPosList
env = gym_super_mario_bros.make("SuperMarioBros-8-4-v1")
env = JoypadSpace(env, [["right"], ["right", "A"],["A"],["left"],["left","A"],['NOOP']])
moves=[]
# Transform the compacted moves into a list of plain moves
for move in compacted_moves:
    k = move[0]
    repeat = move[1]
    moves = moves + [k] * repeat
# render the game to see that Mario is actually moving to the right when the value of x_pos drops
xPosList=renderMario_withMoves(env,moves)
print(xPosList[-50:-40:])
The printed x_pos values of Mario, from the 50th-to-last to the 41st-to-last step, are:
[1264, 1266, 1267, 1269, 1271, 1273, 1274, 252, 254, 256]
Expected behavior
The printed positions should keep increasing, something like this:
[1264, 1266, 1267, 1269, 1271, 1273, 1274, 1276, 1277, 1280]
Additional context
gym_super_mario_bros 7.3.0
nes_py 8.1.8
gym 0.21.0
Python 3.9.7
Windows 10 Pro
8-4 is a puzzle level with circular rooms. You're looping back around.
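For anyone who wants to flag these loop-backs in a recorded trace without rendering the game, here is a minimal sketch. The helper name `find_x_pos_drops` and its `threshold` parameter are hypothetical and not part of gym_super_mario_bros; the threshold of 500 is simply taken from the reproduction script above. It works on a plain list of x_pos values, such as the `xPosList` that script returns.

```python
def find_x_pos_drops(x_positions, threshold=500):
    """Return (index, previous, current) for each step where x_pos falls
    by at least `threshold`. Hypothetical helper, for illustration only."""
    drops = []
    for i in range(1, len(x_positions)):
        prev, cur = x_positions[i - 1], x_positions[i]
        if prev - cur >= threshold:
            drops.append((i, prev, cur))
    return drops


# The ten values printed in the issue.
trace = [1264, 1266, 1267, 1269, 1271, 1273, 1274, 252, 254, 256]
print(find_x_pos_drops(trace))  # [(7, 1274, 252)]
```

How an agent should treat such a drop (for example, as a signal that it took the wrong route through the level) is a design choice this sketch does not make.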
OK thanks. I should play the game before coding on it :-)