Kautenja / gym-super-mario-bros

An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The NES

Possibly can't double jump?

zachoines opened this issue · comments

I cannot get the "Double Jump" behavior to happen in the action space. This behavior is produced by holding down "A" and then releasing. However, I can only seem to press "A" in the simulator. This results in Mario just getting stuck at a tall pipe. Is there a way to produce the double jump behavior?

I'm not sure what you mean; there is no double jump in SMB unless you're talking about some sort of frame perfect glitch type of thing. Jumps only vary according to how long A is held for. For instance, pressing and releasing A for a single frame produces the shortest jump, whereas holding A for some number of frames before releasing produces a higher jump. How are you interacting with the environment currently? Which pipe are you talking about exactly? And, can you make a screen recording that demonstrates the issue?

This is in the context of a very simple A2C implementation, where I feed a state to a neural network, retrieve an action, and take that action. I've tried using SIMPLE_MOVEMENT and COMPLEX_MOVEMENT from gym_super_mario_bros.actions. I can't get Mario to jump any higher than the third pipe of the first level (so only a small jump). How do you simulate continued pressing of "A" to produce a long jump?

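if not self._done:
    # Ask the network for one action given the current state (batch of one).
    [action] = network.step([state])
    # Apply that action for a single step and unpack the result.
    [s_t], reward, d, _ = env.step(action)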

The snippet above runs on every state (no frame skipping or stacking currently). I've tried editing gym_super_mario_bros.actions to get different behaviors. What combinations of the "A" action result in a long jump? I'd be very appreciative if you can spot what I'm doing wrong.

Ah, I see the issue. The actions are just combinations of buttons, so a sequence of calls to step in which every action includes "A" in its combination effectively simulates the "hold A" behavior (in pseudocode):

Single Jump

step("A")
step("NOP")

Extended Jump

step("A")
step("A + right")
...
step("NOP")

(also, the last step doesn't have to be nop, it's arbitrary)
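The pseudocode above could be turned into Python roughly as follows. This is only a sketch under stated assumptions: MOVEMENT is a local, hand-written list in the style of the action lists in gym_super_mario_bros.actions, the helper names (action_index, held_jump, run_actions) are hypothetical, env.step is assumed to return the four-value tuple used in the snippet earlier in this thread, and the number of frames to hold "A" is a placeholder to experiment with, not a known value.

```python
from typing import List, Sequence

# Assumption: a local action list written in the style of
# gym_super_mario_bros.actions.SIMPLE_MOVEMENT (each action is a list of buttons).
MOVEMENT: List[List[str]] = [
    ["NOOP"],
    ["right"],
    ["right", "A"],
    ["right", "B"],
    ["right", "A", "B"],
    ["A"],
    ["left"],
]


def action_index(movement: Sequence[Sequence[str]], buttons: Sequence[str]) -> int:
    """Return the index of the action whose button set equals `buttons`."""
    wanted = set(buttons)
    for i, combo in enumerate(movement):
        if set(combo) == wanted:
            return i
    raise ValueError(f"no action with buttons {sorted(wanted)}")


def held_jump(
    movement: Sequence[Sequence[str]],
    hold_frames: int,
    release: Sequence[str] = ("NOOP",),
) -> List[int]:
    """Build action indices that hold right + A for `hold_frames` steps, then release."""
    if hold_frames < 1:
        raise ValueError("hold_frames must be at least 1")
    hold = action_index(movement, ["right", "A"])
    return [hold] * hold_frames + [action_index(movement, release)]


def run_actions(env, actions: Sequence[int]) -> float:
    """Step `env` through `actions`; stop early if the episode ends. Returns total reward."""
    total = 0.0
    for action in actions:
        # Assumes the four-value step API: (state, reward, done, info).
        _, reward, done, _ = env.step(action)
        total += reward
        if done:
            break
    return total
```

With an environment whose discrete actions follow MOVEMENT, something like run_actions(env, held_jump(MOVEMENT, hold_frames=20)) would press right + A on 20 consecutive steps before releasing. The value 20 is arbitrary here and would need tuning against the pipe in question.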


Store the outputs of the model during the phase where Mario is stuck at the pipe and see if the agent is outputting sequences of "A" presses. The "extended jump" requires some number of held A frames, I don't know how many off the top of my head though -- presumably the height is a function of how long A is held.
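One way to run this check is to log the action index chosen at each step while Mario is stuck, then measure the longest run of consecutive actions that include "A". The sketch below assumes the log is a plain list of integer action indices and that MOVEMENT is a local list in the style of gym_super_mario_bros.actions; the function name longest_a_run is hypothetical.

```python
from typing import List, Sequence

# Assumption: a local action list in the style of
# gym_super_mario_bros.actions.SIMPLE_MOVEMENT.
MOVEMENT: List[List[str]] = [
    ["NOOP"],
    ["right"],
    ["right", "A"],
    ["right", "B"],
    ["right", "A", "B"],
    ["A"],
    ["left"],
]


def longest_a_run(action_log: Sequence[int], movement: Sequence[Sequence[str]]) -> int:
    """Return the length of the longest run of consecutive actions that include "A"."""
    longest = 0
    current = 0
    for action in action_log:
        if "A" in movement[action]:
            # "A" is still held on this step, so the run continues.
            current += 1
            longest = max(longest, current)
        else:
            # "A" was released, so the run ends.
            current = 0
    return longest


# Hypothetical log of actions recorded while the agent was stuck at the pipe.
example_log = [2, 0, 2, 1, 5, 2, 2, 0]
longest = longest_a_run(example_log, MOVEMENT)  # 3
```

If the longest run stays at one or two steps, the agent is probably tapping "A" rather than holding it, which would match the short jumps described above.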

This looks like the solution. Let me give it a try and report back with findings here.