RobertTLange / gymnax

RL Environments in JAX 🌍

BernoulliBandit observation space bounds are incorrect when time normalisation is enabled.

jaronsgit opened this issue

With normalize_time: bool = True, the number of steps in the observation is normalised to lie between -1 and 1, but the observation space bounds are still 0 and params.max_steps_in_episode = 100, so normalised values below 0 fall outside the declared bounds.
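
A minimal sketch of the mismatch, in plain Python rather than gymnax itself. The linear rescaling formula and the helper names (`normalize_time`, `in_bounds`) are assumptions for illustration, not the library's code; only the [-1, 1] range, the lower bound of 0, and `max_steps_in_episode = 100` come from the report above.

```python
def normalize_time(t, t_max, lo=-1.0, hi=1.0):
    """Linearly rescale a step count t in [0, t_max] to [lo, hi].

    Assumed formula for illustration; gymnax may compute this differently.
    """
    return (hi - lo) * t / t_max + lo


def in_bounds(x, low, high):
    """Return True if x lies inside the closed interval [low, high]."""
    return low <= x <= high


max_steps = 100  # params.max_steps_in_episode, as quoted in the issue

# Bounds the observation space declares for the time entry.
declared_low, declared_high = 0.0, float(max_steps)

# Time values the environment would emit when normalisation is enabled.
normalized = [normalize_time(t, max_steps) for t in range(max_steps + 1)]

# Normalised values that fall outside the declared bounds.
violations = [x for x in normalized if not in_bounds(x, declared_low, declared_high)]

# The same values checked against bounds that match the normalised range.
fixed_violations = [x for x in normalized if not in_bounds(x, -1.0, 1.0)]

print(f"first/last normalised value: {normalized[0]}, {normalized[-1]}")
print(f"values outside [0, {max_steps}]: {len(violations)}")
print(f"values outside [-1, 1]: {len(fixed_violations)}")
```

Under these assumptions, every step in the first half of an episode yields a negative time value, which a space bounded below by 0 would reject. Bounds of -1 and 1 would cover the whole normalised range.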

Thank you so much @jaronsgit -- it is merged and will be part of the next release. Cheers, Rob