Should policy state be reset after every episode?
edwhu opened this issue · comments
It seems like the state of the agent (`self._state`) is not initialized to 0 on reset. Only in the very first episode, it is `None`, so it will be set to 0s. Since `driver.reset()` is never called again in `api.py`, `self._state` will be carried over from previous episodes on episode reset.
Is this intentional?
dreamerv2/dreamerv2/common/driver.py, lines 32 to 40 in 07d906e
Yep, the world model resets its state based on the `is_first` flag:
dreamerv2/dreamerv2/common/nets.py, line 94 in 07d906e
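For readers without the referenced files at hand, here is a minimal sketch of the pattern described above. It is not the dreamerv2 code: `ToyDriver`, `mask_state_on_first`, and the step-counting "recurrent update" are hypothetical names and stand-ins chosen for illustration. It only shows why a driver that never resets its stored state can still be correct, as long as the model clears that state wherever `is_first` is set.

```python
def mask_state_on_first(state, is_first):
    """Return a copy of `state` with rows zeroed where an episode just started.

    Hypothetical helper: `state` is a list of per-environment feature lists,
    `is_first` is a list of booleans, one per environment.
    """
    return [
        [0.0] * len(row) if first else list(row)
        for row, first in zip(state, is_first)
    ]


class ToyDriver:
    """Hypothetical driver that carries state across episodes without resetting it."""

    def __init__(self, num_envs, state_dim):
        self._num_envs = num_envs
        self._state_dim = state_dim
        self._state = None  # None only before the very first step.

    def step(self, is_first):
        if self._state is None:
            self._state = [[0.0] * self._state_dim for _ in range(self._num_envs)]
        # The model side clears stale state for environments that just reset.
        state = mask_state_on_first(self._state, is_first)
        # Stand-in for a recurrent update: count steps since the episode start.
        self._state = [[value + 1.0 for value in row] for row in state]
        return self._state


driver = ToyDriver(num_envs=2, state_dim=3)
driver.step([True, True])          # both environments start an episode
driver.step([False, False])        # both continue
out = driver.step([True, False])   # environment 0 restarts, environment 1 continues
print(out)
```

In this sketch the carried-over state for environment 0 is discarded on its restart even though the driver never resets `self._state`, while environment 1 keeps accumulating.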