Why must the reward function be called before updating the observation in PointWiseEnv.py?

Question

2017040264 opened this issue 2 years ago

Shouldn't the reward be an evaluation of the newly observed state S(t+1)?
For example, see CartPole in OpenAI Gym (openai_gym_cartpole).
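
To make the two orderings concrete, here is a minimal sketch. It uses a hypothetical toy environment (`ToyEnv`, `_reward`, and both `step_*` methods are made up for illustration). It is not the code from PointWiseEnv.py or from Gym; it only shows how the reward differs depending on whether it is computed before or after the state update.

```python
class ToyEnv:
    """Hypothetical 1-D environment; not taken from PointWiseEnv.py or Gym."""

    def __init__(self, target=3):
        self.target = target
        self.state = 0

    def _reward(self, state):
        # The reward depends only on the state it is given:
        # negative distance to the target.
        return -abs(self.target - state)

    def step_reward_after_update(self, action):
        # Ordering I expected: move to S(t+1), then score S(t+1).
        self.state += action
        return self.state, self._reward(self.state)

    def step_reward_before_update(self, action):
        # Ordering I am asking about: score S(t), then move to S(t+1).
        reward = self._reward(self.state)
        self.state += action
        return self.state, reward


env_a, env_b = ToyEnv(), ToyEnv()
obs_a, reward_a = env_a.step_reward_after_update(1)
obs_b, reward_b = env_b.step_reward_before_update(1)
print(obs_a, reward_a)  # same observation as below, reward based on S(t+1)
print(obs_b, reward_b)  # same observation as above, reward based on S(t)
```

Both calls return the same observation, but the reward lags one step behind in the second variant, which is the behavior I would like to understand.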